https://github.com/kwea123/hindsight_experience_replay
A tensorflow implementation of hindsight experience replay
deep-q-learning hindsight-experience-replay jupyter-notebook python reinforcement-learning tensorflow
- Host: GitHub
- URL: https://github.com/kwea123/hindsight_experience_replay
- Owner: kwea123
- Created: 2018-03-12T02:44:42.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-04-19T08:33:31.000Z (about 7 years ago)
- Last Synced: 2025-03-21T22:22:03.047Z (about 2 months ago)
- Topics: deep-q-learning, hindsight-experience-replay, jupyter-notebook, python, reinforcement-learning, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 1.67 MB
- Stars: 17
- Watchers: 3
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Hindsight Experience Replay (HER)
This repository contains a TensorFlow implementation of HER and the bit flipping environment described in [OpenAI's paper](https://arxiv.org/pdf/1707.01495.pdf).

The implementation includes:
1. In `Hindsight Experience Replay.ipynb`:
    1. A DQN and a DDQN agent (which also work on other traditional [gym](https://gym.openai.com/) environments)
    2. A bit flipping environment
    3. Pre-trained models for the 30-bit, 40-bit and 50-bit flipping environments
2. In `ChaseEnv_DDPG.ipynb`:
    1. A DDPG agent
    2. A `ChaseEnv` environment, where a chaser is initialized at a random position in a 2D plane and has to reach a goal at another random position within a certain threshold.

## Benchmarks
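The bit lengths quoted in this section refer to the bit flipping task from the paper. As a rough reminder of what that task looks like, here is a minimal sketch that follows the paper's description rather than the notebook's code. The class name `BitFlipEnv`, its constructor arguments and its return values are assumptions made for illustration only.

```python
import random


class BitFlipEnv:
    """Minimal n-bit flipping task (illustrative, not the notebook's class)."""

    def __init__(self, n_bits=8, seed=None):
        self.n_bits = n_bits
        self.rng = random.Random(seed)
        self.state = None
        self.goal = None

    def reset(self):
        # Sample a random start state and a random goal bit string.
        self.state = [self.rng.randint(0, 1) for _ in range(self.n_bits)]
        self.goal = [self.rng.randint(0, 1) for _ in range(self.n_bits)]
        return list(self.state), list(self.goal)

    def step(self, action):
        # Action i flips bit i of the current state.
        self.state[action] ^= 1
        done = self.state == self.goal
        # Sparse reward: 0 when the goal is matched, -1 otherwise.
        reward = 0.0 if done else -1.0
        return list(self.state), reward, done


env = BitFlipEnv(n_bits=8, seed=0)
state, goal = env.reset()
# Flipping every mismatched bit once reaches the goal.
for i in range(env.n_bits):
    if state[i] != goal[i]:
        state, reward, done = env.step(i)
```

With a reward this sparse, random exploration almost never sees a success once the bit string gets long, which is the motivation for HER.
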
* 100% success rate for the 30-bit and 40-bit environments
* 95% success rate for the 50-bit environment (averaged over 100 tests)
* 90% success rate for `ChaseEnv` with size=5 (averaged over 100 tests)

## Customize
Check the "Training" cell to adjust training parameters and enable/disable HER.

## TODO
- [x] Optimize the way to concatenate transitions
- [ ] Parallelize training
- [x] Train on bit length > 30
- [x] Implement DDPG

## Extra
[Here](https://github.com/kwea123/RL/blob/master/ai/unity_test/robot_arm/robot_arm_3d_ddpg_her_sparse.ipynb) is a link to a robot arm reach environment created in Unity and trained with [ML-Agents](https://github.com/Unity-Technologies/ml-agents). The agent in this environment is trained using DDPG both with and without HER, and the comparison is plotted. DDPG+HER performs better.
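
To make the "with and without HER" difference concrete, here is a minimal sketch of hindsight relabeling using the "final" strategy from the paper: each transition is stored once with the original goal and once with the state actually reached at the end of the episode substituted as the goal. The function names `sparse_reward` and `her_relabel_final` and the tuple layout are assumptions for illustration. The notebooks may use a different strategy (the paper also describes "future", "episode" and "random") or a different storage format.

```python
def sparse_reward(state, goal):
    # 0 when the goal is reached, -1 otherwise (sparse reward as in the paper).
    return 0.0 if state == goal else -1.0


def her_relabel_final(episode):
    """Build replay transitions for one episode with hindsight goals.

    episode: list of (state, action, next_state, goal) tuples.
    Returns a list of (state, action, reward, next_state, goal, done) tuples,
    two per step: one with the original goal, one with the achieved goal.
    """
    achieved = episode[-1][2]  # state reached at the end of the episode
    transitions = []
    for state, action, next_state, goal in episode:
        for g in (goal, achieved):
            reward = sparse_reward(next_state, g)
            transitions.append((state, action, reward, next_state, g, reward == 0.0))
    return transitions


# A failed 3-bit episode: the goal (1, 1, 1) is never reached.
episode = [
    ((0, 0, 0), 0, (1, 0, 0), (1, 1, 1)),
    ((1, 0, 0), 1, (1, 1, 0), (1, 1, 1)),
]
for transition in her_relabel_final(episode):
    print(transition)
```

Without relabeling, every transition in this episode carries a reward of -1. With relabeling, the last step also appears as a success for the goal `(1, 1, 0)`, which gives an off-policy learner such as DQN or DDPG a non-trivial learning signal even from failed episodes.
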