https://github.com/fszewczyk/rocket-landing-rl
Custom OpenAI Gym for vertical rocket landing and Deep Q-Learning implementation.
- Host: GitHub
- URL: https://github.com/fszewczyk/rocket-landing-rl
- Owner: fszewczyk
- License: other
- Created: 2022-12-09T12:43:17.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2024-02-11T22:04:10.000Z (over 2 years ago)
- Last Synced: 2025-11-27T19:55:01.119Z (6 months ago)
- Topics: deep-q-learning, deep-q-network, q-learning, reinforcement-learning, rocket-landing, thrust-vector-control
- Language: Python
- Homepage: https://pypi.org/project/rocketgym/
- Size: 2.49 MB
- Stars: 17
- Watchers: 2
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
_Part of the [SHKYERA](https://youtu.be/Kb4bNZGqKyE) project_


# Rocket Landing - Reinforcement Learning
## Environment
I made a custom OpenAI Gym environment with a fully functioning 2D physics engine. To test your own algorithms against it, install the package from PyPI:
```
pip install rocketgym
```
All the environment's functionalities are described [here](environment/README.md).
### Minimal usage
Make sure all dependencies are installed by running `pip install -r requirements.txt`.
```python
import random

from rocketgym.environment import Environment

env = Environment()
observation = env.reset()
done = False

# Fly with random actions (integers 0..3) until the episode ends.
while not done:
    observation, reward, done, info = env.step(random.randint(0, 3))
    env.render()
```
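The loop above can be wrapped in a small helper that returns the episode's total reward, which is handy for comparing policies. This is only a sketch: `run_episode`, `random_policy` and `max_steps` are names made up here, not part of `rocketgym`, and the helper assumes the same `reset()` / 4-tuple `step()` interface and the same action range (0 to 3) as the example above.

```python
import random


def run_episode(env, policy, max_steps=10_000):
    """Run one episode and return (total_reward, steps_taken).

    `env` is assumed to expose reset() and a step(action) that returns
    (observation, reward, done, info), as in the minimal usage example.
    """
    observation = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            return total_reward, step
    # The step limit was hit before the environment reported done.
    return total_reward, max_steps


def random_policy(observation):
    """Ignore the observation and pick one of the actions 0..3 at random."""
    return random.randint(0, 3)
```

With the package installed, `run_episode(Environment(), random_policy)` should give a baseline to compare a trained agent against.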
## Learning
```
python3 train.py -h
usage: Rocket Landing - Reinforcemeng Learning [-h] [--curriculum] [--softmax] [--save] [-model MODEL]

optional arguments:
  -h, --help    show this help message and exit
  --curriculum  Use Curriculum Learning
  --softmax     Use Softmax exploration instead of eps-greedy
  --save        Save flight logs and models every 100 episodes
  -model MODEL  Path to the model to load. Overrides the curriculum and exploration
                settings. Renders the scene from the start.
```
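The `--softmax` flag switches between two common exploration strategies. For readers unfamiliar with them, here is a generic sketch of both, operating on a list of Q-values (one per action). It is not taken from `train.py`; the function names and the `epsilon` and `temperature` parameters are assumptions made for illustration.

```python
import math
import random


def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature=1.0):
    """Boltzmann distribution over actions.

    A higher temperature moves the distribution closer to uniform;
    a lower one concentrates it on the best action.
    """
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, temperature=1.0):
    """Sample an action index in proportion to its softmax probability."""
    probabilities = softmax_probabilities(q_values, temperature)
    return random.choices(range(len(q_values)), weights=probabilities, k=1)[0]
```

Epsilon-greedy explores uniformly regardless of how bad an action looks, while softmax exploration still prefers actions with higher estimated value when it explores.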
`train.py` shows how agent training is implemented. All you need to do is specify the exploration strategy and adjust the environment to your needs. I found that it takes around 2000 iterations to learn to land without any curriculum learning, but the process can be sped up significantly by setting up a task difficulty schedule. This can be done easily through the `Curriculum` module.
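To make the idea of a task difficulty schedule concrete, here is a minimal sketch of a linear one. It does not use or reproduce the `Curriculum` module's API; `difficulty`, `interpolate`, `ramp_episodes` and the start-height numbers are all hypothetical and only illustrate the general technique.

```python
def difficulty(episode, ramp_episodes=500):
    """Linear ramp from 0.0 (easiest) to 1.0 (full task), constant afterwards."""
    return min(1.0, max(0.0, episode / ramp_episodes))


def interpolate(easy, hard, level):
    """Blend an environment setting between its easy and hard value."""
    return easy + (hard - easy) * level


# Hypothetical setting: the rocket's starting height grows as training progresses.
for episode in (0, 250, 500, 2000):
    level = difficulty(episode)
    start_height = interpolate(10.0, 100.0, level)
    print(f"episode {episode}: difficulty {level:.2f}, start height {start_height:.1f}")
```

The agent first sees short, easy landings and only later the full task, which is the usual reason a curriculum shortens training.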
## Diagnostics
If you want to make pretty plots, feel free to use `diagnostics.py`. Everything you need to know is described in the script itself.
_**For a detailed explanation of the environment and the learning algorithms I used, see [here](https://drive.google.com/file/d/1iqoxaIz_gqfDMqdZBwWLJYfiu0FzKDsv/view?usp=sharing).**_