https://github.com/instadeepai/fastpbrl
Vectorization techniques for fast population-based training.
- Host: GitHub
- URL: https://github.com/instadeepai/fastpbrl
- Owner: instadeepai
- License: apache-2.0
- Created: 2022-06-22T07:20:20.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-12T09:22:14.000Z (over 3 years ago)
- Topics: jax, population-based-training, reinforcement-learning, vectorization
- Language: Python
- Homepage: https://arxiv.org/abs/2206.08888
- Size: 106 KB
- Stars: 55
- Watchers: 6
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Fast Population-Based Reinforcement Learning
[Python 3.8](https://www.python.org/downloads/release/python-380/) ·
[JAX](https://jax.readthedocs.io/en/latest/) ·
[Code style: black](https://github.com/psf/black) ·
[pre-commit](https://github.com/pre-commit/pre-commit)
This repository contains the code for the InstaDeep paper "Fast Population-Based Reinforcement Learning on a Single Machine"
[(Flajolet et al., 2022)](https://arxiv.org/pdf/2206.08888.pdf)
:computer::zap:.
## First-time setup
### Install Docker
This code requires Docker to run. To install Docker, please follow the online instructions
[here](https://docs.docker.com/engine/install/ubuntu/). To enable the code to run on GPU, please also
install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)
as well as the latest NVIDIA driver available for your GPU.
### Build and run a docker image
Once Docker and the NVIDIA Container Toolkit are installed, you can build the docker image with the following command:
```bash
make build
```
and, once the image is built, start the container with:
```bash
make dev_container
```
Inside the container, you can run the `nvidia-smi` command to verify that your GPU is found.
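Beyond `nvidia-smi`, a quick sanity check (a hypothetical snippet, assuming JAX is installed inside the container, as this repository requires) is to ask JAX which accelerators it can see:

```python
# List the devices JAX has detected; on a correctly configured GPU
# container this should include a GPU/CUDA device rather than only CPU.
import jax

devices = jax.devices()
print(devices)
```

If only CPU devices are listed on a GPU machine, the container is most likely missing the NVIDIA runtime or a CUDA-enabled JAX build.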
## Run preconfigured scripts
### Replicate the experiments from the paper
We provide scripts and commands to replicate the experiments discussed in the paper. All these commands are
defined in the Makefile at the root of the repository.
To replicate the experiments corresponding to Figure 2 (where we measure the runtime of a population-wide
update step with various implementations), run:
```bash
make run_timing_sactd3
make run_timing_dqn
```
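The speed-ups measured in these timing experiments come from vectorizing the per-agent update step across the whole population instead of looping over agents. A minimal sketch of the idea with `jax.vmap` (the parameter shapes, learning rate, and agent count here are illustrative, not the paper's actual networks):

```python
import jax
import jax.numpy as jnp

def update(params, grads, lr=0.1):
    # One gradient step for a single agent's parameters.
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Stack the parameters of a population of 4 agents along a leading axis.
pop_params = {"w": jnp.ones((4, 3)), "b": jnp.zeros((4,))}
pop_grads = {"w": jnp.full((4, 3), 0.5), "b": jnp.ones((4,))}

# vmap maps the single-agent update over the population axis, so one
# fused call updates every agent at once instead of a Python loop.
pop_update = jax.vmap(update)
new_params = pop_update(pop_params, pop_grads)
print(new_params["w"].shape)  # (4, 3)
```

Because the vmapped update is a single traced computation, it can be jitted and executed as one kernel launch per population-wide step, which is what the paper's timing comparisons exploit.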
To replicate the experiments discussed in Section 5 (which correspond to full training runs), run the following:
```bash
make run_td3_cemrl
make run_td3_dvd
make run_td3_pbt
make run_sac_pbt
```
Note that DvD training runs are unstable and sometimes crash early on due to NaNs.
We use `tensorboard` to log metrics during training runs. The `tensorboard` command
to run to visualize them is printed when the experiment starts.
### Launch a test script
Run the following command to start a short test that validates that the code in the training scripts works
as expected.
```bash
make test_training_scripts
```
## Contributors
## Citing this work
If you use the code or data in this package, please cite:
```bibtex
@inproceedings{flajolet2022fast,
title={Fast Population-Based Reinforcement Learning on a Single Machine},
author={Flajolet, Arthur and Monroc, Claire Bizon and Beguir, Karim and Pierrot, Thomas},
booktitle={International Conference on Machine Learning},
pages={6533--6547},
year={2022},
organization={PMLR}
}
```