# ppo_pytorch
A simple implementation of the [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347) reinforcement learning algorithm using PyTorch.
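
At its core, PPO optimizes a clipped surrogate objective (the paper linked above). The snippet below is a minimal, self-contained sketch of that loss in PyTorch; it is an illustration only, and the tensor names (`logp_new`, `logp_old`, `adv`) and the `clip_ratio` value are not taken from this repository's code.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, adv, clip_ratio=0.2):
    """Clipped surrogate policy loss from the PPO paper (illustrative sketch).

    logp_new: log-probabilities of the taken actions under the current policy
    logp_old: log-probabilities under the policy that collected the data
    adv:      advantage estimates for the same actions
    """
    # Probability ratio pi_new(a|s) / pi_old(a|s)
    ratio = torch.exp(logp_new - logp_old)
    # Clip the ratio so a single update cannot move the policy too far
    clipped = torch.clamp(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio)
    # PPO maximizes the minimum of the clipped and unclipped objectives;
    # negate it so it can be minimized with a standard optimizer
    return -(torch.min(ratio * adv, clipped * adv)).mean()
```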



## Some features
* A separate hyper-parameters file for easy, practical tuning.
* You can stop and resume training at any time, since the trained models are saved after every epoch in the `models` directory (see the checkpointing sketch after this list).
* [Tensorboard](https://github.com/tensorflow/tensorboard) support: if you have Tensorboard installed, you can track the training progress in real time by running:
```bash
tensorboard --logdir runs
```
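
The per-epoch saving that makes stop/resume possible could be implemented roughly as follows. This is a hedged sketch, not the repository's actual code: the checkpoint file name, the `actor`/`critic` split, and the `models_dir` layout are assumptions.

```python
import os
import torch

def save_checkpoint(actor, critic, optimizer, epoch, models_dir="models"):
    # Persist everything needed to resume training after this epoch
    os.makedirs(models_dir, exist_ok=True)
    torch.save({
        "epoch": epoch,
        "actor_state_dict": actor.state_dict(),
        "critic_state_dict": critic.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    }, os.path.join(models_dir, "checkpoint.pt"))

def load_checkpoint(actor, critic, optimizer, models_dir="models"):
    # Restore the latest checkpoint and return the epoch to resume from
    state = torch.load(os.path.join(models_dir, "checkpoint.pt"))
    actor.load_state_dict(state["actor_state_dict"])
    critic.load_state_dict(state["critic_state_dict"])
    optimizer.load_state_dict(state["optimizer_state_dict"])
    return state["epoch"] + 1
```

Storing the optimizer state alongside the network weights is what lets training continue seamlessly from the saved epoch.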
## How to use
* Clone the repository to a local folder
```bash
git clone https://github.com/abdeladim-s/ppo_pytorch && cd ppo_pytorch
```
* Install the dependencies
```bash
pip install -r requirements.txt
```
* Run the main file
```bash
python main.py
```
This will run the trained agent on the `CartPole-v0` environment.

## How to train your own models

* Add your environment to the `config` dictionary inside the `hyper_paramters` file.
This should be a [gymnasium](https://github.com/Farama-Foundation/Gymnasium) (formerly Gym) environment or any subclass of `gymnasium.Env`:
```python
config = {
    # ...
    'Pendulum-v1': {

    }
}
```
* Override the `defaults` PPO hyper-parameters or create a new set of hyper-parameters.

```python
config = {
    # ...
    'Pendulum-v1': {
        'defaults': {
            'epochs': 100,
        },
        'model-001': {
            "seed": 10,
            "epochs": 25,
            "steps_per_epoch": 1500,
            "max_episode_steps": 100,
            # ...
            "reward_threshold": None
        }
    }
}
```
* Modify the `main` function with the new `env_name` and `model_id`
```python
def main():

    env_name = 'Pendulum-v1'
    model_id = 'model-001'

    # ...
```
_If no `model_id` is given, the `defaults` parameters are used._
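
Conceptually, resolving a `model_id` amounts to overlaying the model-specific values on top of the environment's `defaults`. The exact mechanism is defined by the repository's own code; the helper below is only a hypothetical illustration of that merge.

```python
def resolve_hyper_parameters(config, env_name, model_id=None):
    # Start from the environment's defaults and overlay the
    # model-specific values, if a model_id was provided
    env_config = config[env_name]
    params = dict(env_config.get("defaults", {}))
    if model_id is not None:
        params.update(env_config[model_id])
    return params

# Example: picks up 'epochs': 25 from 'model-001', everything else from 'defaults'
# params = resolve_hyper_parameters(config, 'Pendulum-v1', 'model-001')
```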

* Run the `main.py` file with `train=True` to train the agent, or `train=False` to evaluate the trained model.
```python
def main():

    env_name = 'Pendulum-v1'
    model_id = 'model-001'

    train = True   # for training
    # train = False  # for evaluation
    if train:
        policy = Policy(env_name, model_id=model_id, render_mode=None)
        policy.train()
    else:
        policy = Policy(env_name, model_id=model_id, render_mode='human')
        policy.evaluate(1)
```

__Note__

* You can test a simple random agent using the `test_env.py` script.
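
We have not reproduced `test_env.py` here, but a minimal random-agent rollout with gymnasium typically looks like the following sketch (the environment name is just an example, and the snippet is not the script's actual contents):

```python
import gymnasium as gym

# Roll out a random agent for one episode (illustrative sketch,
# not the contents of test_env.py)
env = gym.make("CartPole-v1", render_mode="human")
obs, info = env.reset(seed=0)

done = False
while not done:
    action = env.action_space.sample()  # pick a random action
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

env.close()
```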

## License
GPLv3. See `LICENSE` for the full license text.