https://github.com/ghubnerr/lunar_lander
A Deep Reinforcement Learning Agent that learns how to Land a rocket from a Box2D Gymnasium Environment!
- Host: GitHub
- URL: https://github.com/ghubnerr/lunar_lander
- Owner: ghubnerr
- License: apache-2.0
- Created: 2023-10-02T21:31:51.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-03T02:10:19.000Z (about 2 years ago)
- Last Synced: 2024-12-30T19:58:53.483Z (10 months ago)
- Topics: dqn, pytorch, reinforcement-learning
- Language: Python
- Homepage: https://gymnasium.farama.org/environments/box2d/lunar_lander/
- Size: 396 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Lunar Lander Deep Q-Network from Gymnasium (formerly OpenAI Gym)!
"This environment is a classic rocket trajectory optimization problem. According to Pontryagin’s maximum principle, it is optimal to fire the engine at full throttle or turn it off. This is the reason why this environment has discrete actions: engine on or off.
There are two environment versions: discrete or continuous. The landing pad is always at coordinates (0,0). The coordinates are the first two numbers in the state vector. Landing outside of the landing pad is possible. Fuel is infinite, so an agent can learn to fly and then land on its first attempt."### [Documentation: Gymnasium](https://gymnasium.farama.org/environments/box2d/lunar_lander/)
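As a quick illustration of the quote above, here is a minimal sketch (my own, not code from this repo) that inspects both environment versions. The id `LunarLander-v2` matches the Gymnasium releases current when this repo was written; newer Gymnasium versions ship `LunarLander-v3`:

```python
import gymnasium as gym

# Discrete version: 4 actions (no-op, fire left engine, fire main engine, fire right engine)
env = gym.make("LunarLander-v2")
print(env.action_space)       # Discrete(4)
print(env.observation_space)  # Box with 8 values: x, y, vx, vy, angle,
                              # angular velocity, left/right leg ground contact

# Continuous version: a 2D throttle vector instead of on/off engines
env_continuous = gym.make("LunarLander-v2", continuous=True)
print(env_continuous.action_space)  # Box(2,)
```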
### Usage and Packages
```
pip install torch gymnasium 'gymnasium[box2d]'
```

You might need to install Box2D separately, which requires the `swig` package to generate the Python bindings for Box2D, since Box2D itself is written in C++:

```
brew install swig
pip install box2d
```
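As a quick sanity check (my own suggestion, not from the repo), building the environment once confirms that `swig` and Box2D installed correctly:

```python
import gymnasium as gym

# Raises gymnasium.error.DependencyNotInstalled if Box2D is missing.
env = gym.make("LunarLander-v2")
env.close()
print("Box2D environment built successfully")
```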
### Average Score: 137.91
For each step, the reward:
- is increased/decreased the closer/further the lander is to the landing pad.
- is increased/decreased the slower/faster the lander is moving.
- is decreased the more the lander is tilted (angle not horizontal).
- is increased by 10 points for each leg that is in contact with the ground.
- is decreased by 0.03 points each frame a side engine is firing.
- is decreased by 0.3 points each frame the main engine is firing.

The episode receives an additional reward of -100 or +100 points for crashing or landing safely, respectively. An episode is considered a solution if it scores at least 200 points.
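To see these shaping terms add up in practice, here is a minimal sketch (an illustration of the standard Gymnasium loop, not code from this repo) that accumulates the per-step reward under a random policy; a trained agent should push this return toward the 200-point solution threshold:

```python
import gymnasium as gym

env = gym.make("LunarLander-v2")
obs, info = env.reset(seed=0)

total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()  # stand-in for the agent's policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward  # shaping terms plus the -100/+100 terminal bonus
    done = terminated or truncated

print(f"Episode return: {total_reward:.1f}")  # >= 200 counts as solved
env.close()
```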
### `land()` and `learn_to_land()`
The `land()` function loads a pre-trained model that ran through 500 episodes of training, while `learn_to_land()` trains from scratch. You can choose which of the two functions runs at the bottom of the `main.py` file, as sketched below. If you set `render_mode=False`, training runs a lot faster because nothing is rendered.
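A hypothetical sketch of that toggle at the bottom of `main.py` (the function names come from this README, but the exact signatures are assumptions):

```python
if __name__ == "__main__":
    # Train from scratch; disabling rendering speeds training up considerably.
    # learn_to_land(render_mode=False)

    # Or load the model pre-trained for 500 episodes and watch it land.
    land()
```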