https://github.com/sash-a/es_pytorch
High-performance implementation of deep neuroevolution in PyTorch using mpi4py. Intended for use on HPC clusters.
- Host: GitHub
- URL: https://github.com/sash-a/es_pytorch
- Owner: sash-a
- Created: 2020-07-05T19:16:24.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-01-24T10:24:42.000Z (over 3 years ago)
- Last Synced: 2025-03-30T09:31:26.204Z (6 months ago)
- Topics: ai, deep-neuroevolution, evolutionary-strategy, gym, mpi, mpi4py, neuroevolution, pytorch, reinforcement-learning
- Language: Python
- Size: 6.36 MB
- Stars: 26
- Watchers: 2
- Forks: 0
- Open Issues: 7
- Metadata Files:
- Readme: README.md
## *Deprecated in favour of my [much faster implementation in Julia](https://github.com/sash-a/ScalableES.jl/)*
# Evolutionary strategies (deep neuroevolution) in PyTorch using MPI
This implementation was made to be as simple and efficient as possible.
The reference implementation can be found [here](https://github.com/uber-research/deep-neuroevolution) (in TensorFlow, using Redis).
It is based on two papers by Uber AI Labs: [here](https://arxiv.org/abs/1712.06567) and [here](https://arxiv.org/abs/1712.06560).

### Implementation
This was made for use on a cluster using MPI (although it can also be used on a single machine). For efficiency, the only
data scattered to all other processes each generation is the positive fitness, negative fitness, and noise index of each evaluated policy. The noise is placed in a block of
shared memory (one block per node) for fast access and a low memory footprint.
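
As an illustration of this communication pattern, here is a minimal mpi4py sketch (not the repo's actual API; the random "fitnesses" and `noise_idx` below stand in for real policy rollouts):

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rng = np.random.default_rng(comm.rank)

# Stand-ins for real rollouts: in the actual code, pos_fit/neg_fit come from
# evaluating the policy perturbed by +noise and -noise drawn at noise_idx.
noise_idx = int(rng.integers(0, 1_000_000))
pos_fit, neg_fit = rng.normal(), rng.normal()

# Each process contributes just 3 numbers; allgather gives every process
# the full result table for the generation.
results = comm.allgather((pos_fit, neg_fit, noise_idx))

if comm.rank == 0:
    print(results)  # [(pos_fit, neg_fit, noise_idx), ...] one per process
```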
### How to run
* create the conda environment: `conda env create -n es_env -f env.yml`
* example usages: `simple_example.py`, `obj.py`, `nsra.py`
* example configs are in `config/`

```
conda activate es_env
mpirun -np {num_procs} python simple_example.py configs/simple_conf.json
```

Make sure that you insert this line before you create your neural network, as the initial creation sets the
initial parameters, which must be deterministic across all processes:
```
torch.random.manual_seed({seed})
```
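
For example (the ordering is the point here; `SmallNet` is a hypothetical stand-in for your network):

```python
import torch
import torch.nn as nn

torch.random.manual_seed(1234)  # seed first, identically on every process

class SmallNet(nn.Module):  # hypothetical stand-in network
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)

net = SmallNet()  # built only after seeding, so initial weights match everywhere
print(net.fc.weight.sum())  # prints the same value on every process
```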
### General info
* To define a policy, create a `src.nn.nn.BaseNet` (a simple extension of `torch.nn.Module`) and
pass it to a `Policy` along with an `src.nn.optimizers.Optimizer` and a float value for the noise standard deviation; an
example of this can be seen in `simple_example.py` and in the sketch after this list.
* If you wish to share the noise using shared memory and MPI, instantiate the `NoiseTable` using
`NoiseTable.create_shared(...)`. Otherwise, if you wish to use your own method of sharing noise, or to run
sequentially, simply create the noise table using its constructor and pass your noise to it like this:
`NoiseTable(my_noise, n_params)`
* `NoiseTable.create_shared(...)` will throw an error if fewer than 2 MPI processes are used
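
As a rough end-to-end sketch of the bullets above (the `NoiseTable` import path, the `Policy` wiring, and a no-argument `BaseNet` constructor are all assumptions here; `simple_example.py` has the real calls):

```python
# Rough sketch only: everything except the src.nn.nn import path is an
# assumption; see simple_example.py for the actual wiring.
import numpy as np
import torch.nn as nn

from src.nn.nn import BaseNet
# from src.noisetable import NoiseTable  # hypothetical import path

class MyNet(BaseNet):  # BaseNet is a simple extension of torch.nn.Module
    def __init__(self):
        super().__init__()  # assumes BaseNet takes no required arguments
        self.layers = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 2))

    def forward(self, x):
        return self.layers(x)

net = MyNet()
n_params = sum(p.numel() for p in net.parameters())

# With >= 2 MPI processes: one shared noise block per node.
# noise_table = NoiseTable.create_shared(...)  # arguments elided

# Running sequentially / sharing noise yourself: the constructor form quoted above.
my_noise = np.random.randn(1_000_000).astype(np.float32)
# noise_table = NoiseTable(my_noise, n_params)

# policy = Policy(net, optimizer, std)  # exact signature in simple_example.py
```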