Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sintefneodroid/agent
Examples of agents for neodroid environments 💡
- Host: GitHub
- URL: https://github.com/sintefneodroid/agent
- Owner: sintefneodroid
- License: apache-2.0
- Created: 2017-07-25T07:39:43.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-10-14T20:08:26.000Z (4 months ago)
- Last Synced: 2024-10-22T20:04:23.114Z (3 months ago)
- Topics: agent, deep-learning, dqn, droid, estimate, game-development, hacktoberfest, imitation-learning, machine-learning, ml, neo, neodroid, obstructions, ppo, python, pytorch, reinforcement-learning, rl, unity, unity3d
- Language: Python
- Homepage: https://pypi.org/project/Neodroid/
- Size: 22.1 MB
- Stars: 9
- Watchers: 5
- Forks: 6
- Open Issues: 14
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE.md
- Code of conduct: .github/CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
Agent
This repository will host all initial machine learning efforts applying the [Neodroid](https://github.com/sintefneodroid/) platform.
---
_[Neodroid](https://github.com/sintefneodroid) is developed with support from Research Council of Norway Grant #262900. ([https://www.forskningsradet.no/prosjektbanken/#/project/NFR/262900](https://www.forskningsradet.no/prosjektbanken/#/project/NFR/262900))_
---
| [![Build Status](https://travis-ci.org/sintefneodroid/agent.svg?branch=master)](https://travis-ci.org/sintefneodroid/agent) | [![Coverage Status](https://coveralls.io/repos/github/sintefneodroid/agent/badge.svg?branch=master)](https://coveralls.io/github/sintefneodroid/agent?branch=master) | [![GitHub Issues](https://img.shields.io/github/issues/sintefneodroid/agent.svg?style=flat)](https://github.com/sintefneodroid/agent/issues) | [![GitHub Forks](https://img.shields.io/github/forks/sintefneodroid/agent.svg?style=flat)](https://github.com/sintefneodroid/agent/network) | [![GitHub Stars](https://img.shields.io/github/stars/sintefneodroid/agent.svg?style=flat)](https://github.com/sintefneodroid/agent/stargazers) |[![GitHub License](https://img.shields.io/github/license/sintefneodroid/agent.svg?style=flat)](https://github.com/sintefneodroid/agent/blob/master/LICENSE.md) |
|---|---|---|---|---|---|
# Contents Of This Readme
- [Algorithms](#algorithms)
- [Requirements](#requirements)
- [Usage](#usage)
- [Results](#results)
- [Target Point Estimator](#target-point-estimator)
- [Perfect Information Navigator](#perfect-information-navigator)
- [Contributing](#contributing)
- [Other Components](#other-components-of-the-neodroid-platform)

# Algorithms
- [REINFORCE (PG)](agent/agents/model_free/policy_optimisation/pg_agent.py)
- [DQN](agent/agents/model_free/q_learning/dqn_agent.py)
- [DDPG](agent/agents/model_free/hybrid/ddpg_agent.py)
- [PPO](agent/agents/model_free/hybrid/ppo_agent.py)
- TRPO, GA, EVO, IMITATION...

## **Algorithms Implemented**
1. *Deep Q Learning (DQN)* ([Mnih et al. 2013](https://arxiv.org/pdf/1312.5602.pdf))
1. *DQN with Fixed Q Targets* ([Mnih et al. 2013](https://arxiv.org/pdf/1312.5602.pdf))
1. *Double DQN (DDQN)* ([Hado van Hasselt et al. 2015](https://arxiv.org/pdf/1509.06461.pdf))
1. *DDQN with Prioritised Experience Replay* ([Schaul et al. 2016](https://arxiv.org/pdf/1511.05952.pdf))
1. *Dueling DDQN* ([Wang et al. 2016](http://proceedings.mlr.press/v48/wangf16.pdf))
1. *REINFORCE* ([Williams et al. 1992](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf))
1. *Deep Deterministic Policy Gradients (DDPG)* ([Lillicrap et al. 2016](https://arxiv.org/pdf/1509.02971.pdf) )
1. *Twin Delayed Deep Deterministic Policy Gradients (TD3)* ([Fujimoto et al. 2018](https://arxiv.org/abs/1802.09477))
1. *Soft Actor-Critic (SAC & SAC-Discrete)* ([Haarnoja et al. 2018](https://arxiv.org/pdf/1812.05905.pdf))
1. *Asynchronous Advantage Actor Critic (A3C)* ([Mnih et al. 2016](https://arxiv.org/pdf/1602.01783.pdf))
1. *Synchronous Advantage Actor Critic (A2C)*
1. *Proximal Policy Optimisation (PPO)* ([Schulman et al. 2017](https://openai-public.s3-us-west-2.amazonaws.com/blog/2017-07/ppo/ppo-arxiv.pdf))
1. *DQN with Hindsight Experience Replay (DQN-HER)* ([Andrychowicz et al. 2018](https://arxiv.org/pdf/1707.01495.pdf))
1. *DDPG with Hindsight Experience Replay (DDPG-HER)* ([Andrychowicz et al. 2018](https://arxiv.org/pdf/1707.01495.pdf) )
1. *Hierarchical-DQN (h-DQN)* ([Kulkarni et al. 2016](https://arxiv.org/pdf/1604.06057.pdf))
1. *Stochastic NNs for Hierarchical Reinforcement Learning (SNN-HRL)* ([Florensa et al. 2017](https://arxiv.org/pdf/1704.03012.pdf))
1. *Diversity Is All You Need (DIAYN)* ([Eysenbach et al. 2018](https://arxiv.org/pdf/1802.06070.pdf))
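As a concrete illustration of the value-based entries above, the sketch below computes the standard DQN temporal-difference loss with a fixed target network in PyTorch. The network shapes, batch layout, and hyperparameters are assumptions for illustration only and do not reflect this repository's agent implementations.

````python
# Minimal DQN temporal-difference update (illustrative sketch, not the repo's API).
import torch
import torch.nn as nn

def dqn_td_loss(online_net: nn.Module,
                target_net: nn.Module,
                batch,                      # (states, actions, rewards, next_states, dones)
                gamma: float = 0.99) -> torch.Tensor:
    states, actions, rewards, next_states, dones = batch  # dones as a float tensor of 0/1

    # Q(s, a) for the actions actually taken.
    q_values = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target uses a separate, periodically synced target network
    # ("DQN with Fixed Q Targets" above).
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        td_target = rewards + gamma * (1.0 - dones) * next_q

    return nn.functional.mse_loss(q_values, td_target)
````

The Double DQN variant listed above differs only in the target: the online network selects the argmax action and the target network evaluates it.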
## **Environments Implemented**

1. *Bit Flipping Game* (as described in [Andrychowicz et al. 2018](https://arxiv.org/pdf/1707.01495.pdf))
1. *Four Rooms Game* (as described in [Sutton et al. 1998](http://www-anw.cs.umass.edu/~barto/courses/cs687/Sutton-Precup-Singh-AIJ99.pdf))
1. *Long Corridor Game* (as described in [Kulkarni et al. 2016](https://arxiv.org/pdf/1604.06057.pdf))
1. *Ant-{Maze, Push, Fall}* (as described in [Nachum et al. 2018](https://arxiv.org/pdf/1805.08296.pdf) and their accompanying [code](https://github.com/tensorflow/models/tree/master/research/efficient-hrl))

# Requirements
- pytorch
- tqdm
- Pillow
- numpy
- matplotlib
- torchvision
- torch
- Neodroid
- pynput (optional)
- visdom
- gym

To install these, run:
````bash
pip3 install -r requirements.txt
````
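After installing, a quick sanity check such as the following can confirm that the core dependencies resolve; it assumes the Neodroid PyPI package is importable as `neodroid`, which may vary by version.

````python
# Quick sanity check that the main dependencies resolved (illustrative; adjust
# module names if your installed versions expose different import paths).
import importlib

for module in ("torch", "torchvision", "numpy", "matplotlib", "tqdm", "visdom", "gym", "neodroid"):
    try:
        importlib.import_module(module)
        print(f"{module}: OK")
    except ImportError as error:
        print(f"{module}: MISSING ({error})")
````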
# Usage

Export `PYTHONPATH` pointing at the repository root so the utilities module can be imported:
````bash
export PYTHONPATH=/path-to-repo/
````

To train an agent, use:
````bash
python3 procedures/train_agent.py
````

To test a trained agent, use:
````bash
python3 procedures/test_agent.py
````
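The scripts above drive the agents listed earlier against a Neodroid environment. As a rough outline of what such a training loop looks like, the sketch below runs a random policy in a placeholder `gym` environment; the environment name, the agent hooks in the comments, and the method names are illustrative and are not the interface of `procedures/train_agent.py`.

````python
# Illustrative outline of an agent training loop; the environment, the random
# policy, and the commented hooks are placeholders.
import gym

env = gym.make("CartPole-v1")   # stand-in for a Neodroid scene

for episode in range(10):
    reset_result = env.reset()
    observation = reset_result[0] if isinstance(reset_result, tuple) else reset_result
    done, episode_return = False, 0.0
    while not done:
        action = env.action_space.sample()   # replace with the agent's action selection
        step_result = env.step(action)
        if len(step_result) == 5:            # gym>=0.26 returns (obs, reward, terminated, truncated, info)
            observation, reward, terminated, truncated, _ = step_result
            done = terminated or truncated
        else:                                # older gym returns (obs, reward, done, info)
            observation, reward, done, _ = step_result
        episode_return += reward
        # the agent would store the transition and take a learning step here
    print(f"episode {episode}: return {episode_return:.1f}")

env.close()
````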
# Results

## Target Point Estimator
Using depth, segmentation, and RGB images to estimate the location of a target point in the environment.
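A minimal sketch of one way such an estimator could be structured is shown below: the three modalities are stacked channel-wise and a small convolutional network regresses a 3D target coordinate. The channel counts, image size, and architecture are assumptions for illustration, not the models used in this repository.

````python
# Illustrative target-point regressor over stacked RGB (3), depth (1) and
# segmentation (1) channels; sizes are assumptions, not the repo's models.
import torch
import torch.nn as nn

class TargetPointEstimator(nn.Module):
    def __init__(self, in_channels: int = 5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 3)  # (x, y, z) of the estimated target point

    def forward(self, rgb, depth, segmentation):
        x = torch.cat([rgb, depth, segmentation], dim=1)   # stack modalities channel-wise
        features = self.encoder(x).flatten(start_dim=1)
        return self.head(features)

# estimator = TargetPointEstimator()
# point = estimator(torch.rand(1, 3, 84, 84), torch.rand(1, 1, 84, 84), torch.rand(1, 1, 84, 84))
````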
### [REINFORCE (PG)](agent/agents/model_free/policy_optimisation/pg_agent.py)
### [DQN](agent/agents/model_free/q_learning/dqn_agent.py)
### [DDPG](agent/agents/model_free/hybrid/ddpg_agent.py)
### [PPO](agent/agents/model_free/hybrid/ppo_agent.py)
### GA, EVO, IMITATION...
## Perfect Information Navigator
The agent has access to perfect location information about the obstructions and the target in the environment; the objective is to navigate to the target without colliding with the obstructions.
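As a rough illustration of what perfect information means here, the observation could be a flat vector of the exact agent, target, and obstruction positions, with a reward that penalises collisions; the layout and reward shaping below are assumptions, not this repository's actual environment interface.

````python
# Illustrative perfect-information observation and reward (assumed layout).
import numpy as np

def build_observation(agent_pos, target_pos, obstruction_positions):
    """Concatenate exact positions into a single flat observation vector."""
    return np.concatenate([agent_pos, target_pos, *obstruction_positions]).astype(np.float32)

def navigation_reward(agent_pos, target_pos, obstruction_positions,
                      collision_radius=0.5, goal_radius=0.5):
    """Dense reward: approach the target, avoid colliding with obstructions."""
    if any(np.linalg.norm(agent_pos - o) < collision_radius for o in obstruction_positions):
        return -1.0                                          # collision penalty
    if np.linalg.norm(agent_pos - target_pos) < goal_radius:
        return 1.0                                           # reached the target
    return -0.01 * np.linalg.norm(agent_pos - target_pos)    # shaping term
````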
### [REINFORCE (PG)](agent/agents/model_free/policy_optimisation/pg_agent.py)
### [DQN](agent/agents/model_free/q_learning/dqn_agent.py)
### [DDPG](agent/agents/model_free/hybrid/ddpg_agent.py)
### [PPO](agent/agents/model_free/hybrid/ppo_agent.py)
### GA, EVO, IMITATION...
# Contributing
See guidelines for contributing [here](.github/CONTRIBUTING.md).
# Licensing
This project is licensed under the Apache V2 License. See [LICENSE](LICENSE.md) for more information.
# Citation
For citation you may use the following bibtex entry:
````
@misc{neodroid-agent,
  author = {Heider, Christian},
  title = {Neodroid Platform Agents},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/sintefneodroid/agent}},
}
````

# Other Components Of The Neodroid Platform
- [neo](https://github.com/sintefneodroid/neo)
- [droid](https://github.com/sintefneodroid/droid)

# Authors
* **Christian Heider Nielsen** - [cnheider](https://github.com/cnheider)
Other [contributors](https://github.com/sintefneodroid/agent/contributors) to this project are listed here.