Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/absadiki/ppo_pytorch
A simple implementation of the Proximal Policy Optimization (PPO) algorithm using PyTorch.
- Host: GitHub
- URL: https://github.com/absadiki/ppo_pytorch
- Owner: absadiki
- License: gpl-3.0
- Created: 2023-01-01T02:48:36.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-01-01T04:10:44.000Z (about 2 years ago)
- Last Synced: 2024-12-17T16:09:31.464Z (about 1 month ago)
- Topics: ppo, proximal-policy-optimization, pytorch, reinforcement-learning, tensorboard
- Language: Python
- Homepage:
- Size: 162 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ppo_pytorch
A simple implementation of the [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347) Reinforcement Learning algorithm using PyTorch.
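For reference, the clipped surrogate objective that PPO maximizes (from the paper linked above), where $r_t(\theta)$ is the probability ratio between the new and old policies, $\hat{A}_t$ is the advantage estimate, and $\epsilon$ is the clipping range:

$$L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t\right)\right]$$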
## Features
* A separate hyper-parameters file for easy, practical tuning.
* You can stop and resume training at any time, since the trained models are saved after every epoch in the `models` directory.
* [TensorBoard](https://github.com/tensorflow/tensorboard) support: if you have TensorBoard installed, you can track the training progress in real time using:
```bash
tensorboard --logdir runs
```
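For context, here is a minimal sketch of how training metrics are typically written to the `runs` directory that the command above reads. The log directory and tag names are illustrative, not necessarily the ones this repository uses:
```python
import math
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='runs/ppo_demo')  # the directory `tensorboard --logdir runs` reads
for epoch in range(100):
    # placeholder curve standing in for the mean episode return of this epoch's rollouts
    mean_return = 200 * (1 - math.exp(-epoch / 30))
    writer.add_scalar('reward/mean_return', mean_return, epoch)  # hypothetical tag name
writer.close()
```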
## How to use
* Clone the repository to a local folder
```bash
git clone https://github.com/abdeladim-s/ppo_pytorch && cd ppo_pytorch
```
* Install the dependencies
```bash
pip install -r requirements.txt
```
* Run the main file
```bash
python main.py
```
This will run the trained agent on the `CartPole-v0` environment (similar to the image above).
## How to train your own models
* Add your environment to the `config` dictionary inside the `hyper_paramters` file.
This should be a [gymnasium](https://github.com/Farama-Foundation/Gymnasium) (formerly Gym) environment or any subclass of `gymnasium.Env`; a minimal custom-environment sketch follows the snippet below.
```python
config = {
    # ...
    'Pendulum-v1': {
    }
}
```
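If you want to plug in your own environment rather than a registered ID, a minimal `gymnasium.Env` subclass looks like this (an illustrative toy environment, not part of this repository):
```python
import gymnasium as gym
import numpy as np

class LineEnv(gym.Env):
    """Toy 1-D environment: the agent tries to reach the origin."""

    def __init__(self):
        self.observation_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(2)  # 0 = step left, 1 = step right
        self._pos = np.zeros(1, dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self._pos = self.np_random.uniform(-1.0, 1.0, size=1).astype(np.float32)
        return self._pos.copy(), {}

    def step(self, action):
        self._pos += 0.1 if action == 1 else -0.1
        self._pos = np.clip(self._pos, -1.0, 1.0)
        reward = float(-abs(self._pos[0]))           # closer to the origin is better
        terminated = bool(abs(self._pos[0]) < 0.05)  # success: reached the origin
        return self._pos.copy(), reward, terminated, False, {}
```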
* Override the `defaults` PPO hyper-parameters or create a new set of hyper-parameters (a sketch of how the two presumably merge follows the snippet).
```python
config = {
    # ...
    'Pendulum-v1': {
        'defaults': {
            'epochs': 100,
        },
        'model-001': {
            "seed": 10,
            "epochs": 25,
            "steps_per_epoch": 1500,
            "max_episode_steps": 100,
            # ...
            "reward_threshold": None
        }
    }
}
```
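Presumably the model-specific entries override the `defaults` of the same environment. A plain-dict sketch of that merge (the repository's actual lookup logic may differ):
```python
def resolve_params(config, env_name, model_id=None):
    """Merge an environment's default hyper-parameters with a model-specific set."""
    env_cfg = config[env_name]
    params = dict(env_cfg.get('defaults', {}))    # start from the defaults
    if model_id is not None:
        params.update(env_cfg.get(model_id, {}))  # model-specific values win
    return params

# With the config above: resolve_params(config, 'Pendulum-v1', 'model-001')['epochs'] == 25
```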
* Modify the `main` function with the new `env_name` and `model_id`
```python
def main():
    env_name = 'Pendulum-v1'
    model_id = 'model-001'
    # ...
```
_If no `model_id` is given, the `defaults` parameters are used._
* Run the `main.py` file with `train=True` to train the agent, or with `train=False` to evaluate the trained model.
```python
def main():
    env_name = 'Pendulum-v1'
    model_id = 'model-001'
    train = True  # for training
    # train = False  # for evaluation
    if train:
        policy = Policy(env_name, model_id=model_id, render_mode=None)
        policy.train()
    else:
        policy = Policy(env_name, model_id=model_id, render_mode='human')
        policy.evaluate(1)
```
__Note__
* You can test a simple random agent using the `test_env.py` script.
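For illustration, a random-agent loop of the kind `test_env.py` presumably runs; this sketch uses gymnasium directly and is not the script's actual code:
```python
import gymnasium as gym

env = gym.make('CartPole-v1', render_mode='human')
obs, info = env.reset(seed=0)
for _ in range(500):
    action = env.action_space.sample()  # random agent: the observation is ignored
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```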
## License
GPLv3. See `LICENSE` for the full license text.