Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Reinforcement learning framework to accelerate research
- Host: GitHub
- URL: https://github.com/heronsystems/adeptrl
- Owner: heronsystems
- License: gpl-3.0
- Created: 2018-08-23T20:24:09.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2021-08-25T16:28:31.000Z (about 3 years ago)
- Last Synced: 2024-11-11T03:52:39.882Z (5 days ago)
- Topics: actor-critic, artificial-intelligence, atari, pysc2, pytorch, reinforcement-learning, starcraft2-ai
- Language: Python
- Homepage:
- Size: 2.61 MB
- Stars: 204
- Watchers: 21
- Forks: 29
- Open Issues: 20
Metadata Files:
- Readme: README.md
- License: LICENSE
README
![banner](images/banner.png)
adept is a reinforcement learning framework designed to accelerate research
by abstracting away engineering challenges associated with deep reinforcement
learning. adept provides:
* multi-GPU training
* a modular interface for using custom networks, agents, and environments
* baseline reinforcement learning models and algorithms for PyTorch
* built-in tensorboard logging, model saving, reloading, evaluation, and
rendering
* proven hyperparameter defaults

This code is early access; expect rough edges. Interfaces are subject to change.
We're happy to accept feedback and contributions.

### Read More
* [Installation](#installation)
* [Quickstart](#quickstart)
* [Features](#features)
* [Performance](#performance)

### Documentation
* [Architecture Overview](docs/api_overview.md)
* [ModularNetwork Overview](docs/modular_network.md)
* [Resume training](docs/resume_training.md)
* Evaluate a model
* Render environment

### Examples
* Custom Network ([stub](examples/custom_network_stub.py) | example)
* Custom SubModule ([stub](examples/custom_submodule_stub.py) | [example](adept/network/net1d/lstm.py))
* Custom Agent ([stub](examples/custom_agent_stub.py) | [example](adept/agent/actor_critic.py))
* Custom Environment ([stub](examples/custom_environment_stub.py) | [example](adept/env/openai_gym.py))

## Installation
```bash
git clone https://github.com/heronsystems/adeptRL
cd adeptRL
pip install -e .[all]
```

**From docker:**
* [docker instructions](./docker/)

## Quickstart
**Train an Agent**
Logs go to `/tmp/adept_logs/` by default. The log directory contains the
tensorboard file, saved models, and other metadata.

```bash
# Local Mode (A2C)
# We recommend 4GB+ GPU memory, 8GB+ RAM, 4+ cores
python -m adept.app local --env BeamRiderNoFrameskip-v4

# Distributed Mode (A2C, requires NCCL)
# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ cores
python -m adept.app distrib --env BeamRiderNoFrameskip-v4

# IMPALA (requires ray, resource intensive)
# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ cores
python -m adept.app actorlearner --env BeamRiderNoFrameskip-v4

# To see a full list of options:
python -m adept.app -h
python -m adept.app help
```

**Use your own Agent, Environment, Network, or SubModule**
```python
"""
my_script.py

Train an agent on a single GPU.
"""
from adept.scripts.local import parse_args, main
from adept.network import NetworkModule, SubModule1D
from adept.agent import AgentModule
from adept.env import EnvModule


class MyAgent(AgentModule):
    pass  # Implement


class MyEnv(EnvModule):
    pass  # Implement


class MyNet(NetworkModule):
    pass  # Implement


class MySubModule1D(SubModule1D):
    pass  # Implement


if __name__ == '__main__':
    import adept

    adept.register_agent(MyAgent)
    adept.register_env(MyEnv)
    adept.register_network(MyNet)
    adept.register_submodule(MySubModule1D)
    main(parse_args())
```
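The `register_*` calls above make your classes visible to adept's command-line scripts before `main` runs; conceptually this is a name-to-class lookup. A minimal sketch of the pattern (illustrative only, not adept's actual internals):

```python
# Minimal registry pattern: map a class name to the class itself so a
# command-line flag like --agent MyAgent can be resolved to a constructor.
REGISTRY = {}


def register_agent(cls):
    """Store cls under its class name; returning cls lets this double as a decorator."""
    REGISTRY[cls.__name__] = cls
    return cls


class MyAgent:
    pass


register_agent(MyAgent)
agent_cls = REGISTRY["MyAgent"]  # what --agent MyAgent would resolve to
agent = agent_cls()
```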
* Call your script like this: `python my_script.py --agent MyAgent --env
env-id-1 --custom-network MyNet`
* You can see all the args [here](adept/scripts/local.py), or see how to implement
  the stubs in the examples section above.

## Features
### Scripts
**Local (Single-node, Single-GPU)**
* Best place to [start](adept/scripts/local.py) if you're trying to understand the code.

**Distributed (Multi-node, Multi-GPU)**
* Uses NCCL backend to all-reduce gradients across GPUs without a parameter
server or host process.
* Supports NVLINK and InfiniBand to reduce communication overhead
* InfiniBand untested, since we do not have a setup to test on.

**Importance Weighted Actor-Learner Architectures, IMPALA (Single-Node, Multi-GPU)**
* Our implementation uses GPU workers rather than CPU workers for forward
passes.
* On Atari we achieve ~4k SPS = ~16k FPS with two GPUs and an 8-core CPU.
* "Note that the shallow IMPALA experiment completes training over 200
million frames in less than one hour."
* IMPALA official experiments use 48 cores.
* Ours: ~2,000 frames/(second · CPU core), i.e. 16k FPS over 8 cores; DeepMind: ~1,157 frames/(second · CPU core)
* Does not yet support multiple nodes or direct GPU memory transfers.

### Agents
* Advantage Actor Critic, A2C ([paper](https://arxiv.org/pdf/1708.05144.pdf) | [code](adept/agents/actor_critic.py))
* Actor Critic Vtrace, IMPALA ([paper](https://arxiv.org/pdf/1802.01561.pdf) | [code](https://arxiv.org/pdf/1802.01561.pdf))

### Networks
* Modular Network Interface: supports arbitrary input and output shapes up to
4D via a SubModule API.
* Stateful networks (i.e., LSTMs)
* Batch normalization ([paper](https://arxiv.org/pdf/1502.03167.pdf))

### Environments
* OpenAI Gym Atari

## Performance
* ~ 3,000 Steps/second = 12,000 FPS (Atari)
* Local Mode
* 64 environments
* GeForce 2080 Ti
* Ryzen 2700x 8-core
* Used to win a
[Doom competition](https://www.crowdai.org/challenges/visual-doom-ai-competition-2018-track-2)
(Ben Bell / Marv2in)
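The steps-to-frames conversions quoted above follow from the standard Atari frame skip of 4, under which each environment step consumes four raw frames. A quick sanity check of the arithmetic:

```python
# Atari wrappers conventionally repeat each action for 4 consecutive
# frames ("frame skip"), so one environment step consumes 4 raw frames.
FRAME_SKIP = 4

steps_per_sec = 3_000                         # local-mode throughput quoted above
frames_per_sec = steps_per_sec * FRAME_SKIP   # 12,000 FPS

total_steps = 50_000_000                      # benchmark training length
total_frames = total_steps * FRAME_SKIP       # 200M frames
```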
![architecture](images/benchmark.png)
* Trained for 50M Steps / 200M Frames
* Up to 30 no-ops at start of each episode
* Evaluated on different seeds than trained on
* Architecture: [Four Convs](./adept/networks/net3d/four_conv.py) (F=32)
followed by an [LSTM](./adept/networks/net1d/lstm.py) (F=512)
* Reproduce with `python -m adept.app local --logdir ~/local64_benchmark --eval
-y --nb-step 50e6 --env `

## Acknowledgements
We borrow pieces of OpenAI's [gym](https://github.com/openai/gym) and
[baselines](https://github.com/openai/baselines) code. We indicate where this
is done.