# PolicyGradient_in_tensorflow2.0
Implementation of Policy Gradient in TensorFlow 2.0

* This code is based on the TensorFlow 1.x skeleton code provided in the course [CS294-112 github](https://github.com/berkeleydeeprlcourse/homework/tree/master/hw2).
* Supports both continuous and discrete action spaces, with an optional reward-to-go formulation.
* Normalizes the q-values to reduce the variance of the gradient estimate.
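As a rough illustration (not the exact code in this repo), the reward-to-go and normalization steps from the bullets above can be sketched in NumPy; the function names here are hypothetical:

```python
import numpy as np

def reward_to_go(rewards, gamma=0.99):
    """Discounted sum of future rewards from each timestep onward."""
    rtg = np.zeros_like(rewards, dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

def normalize(q):
    """Center and scale q-values to reduce gradient variance."""
    return (q - q.mean()) / (q.std() + 1e-8)

q = reward_to_go(np.array([1.0, 1.0, 1.0]), gamma=0.5)
# q = [1.75, 1.5, 1.0]: each entry only sums rewards from that step on
```

Because reward-to-go drops rewards earned *before* each action from that action's weight, it lowers variance without biasing the gradient.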

## Prerequisites
* python3
* TensorFlow 2.0
* TensorFlow Probability (a nightly build is required for use with TF 2.0)
* numpy
* gym

## Example Usage

### Training
* Available arguments: `--discount`, `--n_experiment`, `--n_iter`, `--seed`, `--batch`, `--learning_rate`, `--exp_name`, `--render`

```
$ python PG_in_tf2.0.py CartPole-v0 --exp_name Test_disc --seed 1 --render
$ python PG_in_tf2.0.py Pendulum-v0 --exp_name Test_cont --seed 1
```
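For the discrete-action case run above, each training iteration minimizes a surrogate loss whose gradient is the policy gradient. A framework-agnostic NumPy sketch (illustrative names, not this repo's actual API):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax over action logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pg_surrogate_loss(logits, actions, q_values):
    """Mean of -log pi(a_t|s_t) * q_t over a batch of timesteps.

    Differentiating this loss w.r.t. the policy parameters yields the
    (negative) policy gradient estimate, so minimizing it ascends reward.
    """
    probs = softmax(logits)
    logp = np.log(probs[np.arange(len(actions)), actions])
    return -np.mean(logp * q_values)
```

In TF 2.0 the same quantity would typically be computed inside a `tf.GradientTape` so the optimizer can apply its gradient; for continuous actions, the log-probability comes from a TensorFlow Probability distribution (e.g. a Gaussian) instead of a softmax.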

### Plotting
```
$ python plot.py Data/your_exp_name
$ python plot.py Data/your_exp_name -f
```

## Example Result
* Environment: CartPole-v0
* Iterations: 100
* Number of experiments: 2
![CartPole](result/CartPole-full.png)

## Author

Wonjun Son / [Github](https://github.com/wongongv)