https://github.com/leo27945875/td3-ant-v2



# TD3

![](./ant.gif)

## Set the parameters in the "Set the parameters" block of `TD3_Ant.ipynb`:
* env_name = Name of the environment
* seed = Random seed
* start_timesteps = Timestep at which training of the TD3 model starts
* eval_freq = Frequency of evaluation
* max_timesteps = Maximum timesteps
* save_models = Whether to save the model
* expl_noise = Exploration noise
* batch_size = Batch size
* discount = Discount factor
* tau = Coefficient for the smooth (soft) update of the target networks, as in the TD3 paper
* policy_noise = Policy noise to do target policy smoothing
* noise_clip = Maximum value of policy noise
* policy_freq = Frequency to update the actor and target networks
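
As a rough illustration, the parameter block might look like the sketch below. The values are assumptions based on commonly used TD3 defaults and are not necessarily the ones in `TD3_Ant.ipynb`; `"Ant-v2"` is guessed from the repository name. `soft_update` is a hypothetical helper, written with NumPy for brevity, that shows how `tau` blends the online weights into the target weights.

```python
import numpy as np

# Assumed example values (common TD3 defaults); the notebook's values may differ.
env_name = "Ant-v2"       # Name of the environment (assumed from the repo name)
seed = 0                  # Random seed
start_timesteps = 10_000  # Timestep at which training starts
eval_freq = 5_000         # Frequency of evaluation (in timesteps)
max_timesteps = 500_000   # Maximum timesteps
save_models = True        # Whether to save the model
expl_noise = 0.1          # Exploration noise
batch_size = 100          # Batch size
discount = 0.99           # Discount factor
tau = 0.005               # Soft-update coefficient for the target networks
policy_noise = 0.2        # Noise used for target policy smoothing
noise_clip = 0.5          # Maximum absolute value of the policy noise
policy_freq = 2           # Update actor and targets every `policy_freq` iterations


def soft_update(target_params, online_params, tau):
    """Hypothetical helper: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


# Toy usage: one soft update moves the target a fraction `tau` toward the online weights.
target = [np.zeros(3)]
online = [np.ones(3)]
target = soft_update(target, online, tau)
print(target[0])
```

A small `tau` keeps the target networks changing slowly, which is what stabilizes the critic targets.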

## Model Training:

After setting all the parameters, click "Run All" to train and evaluate the TD3 model. The model files are saved in the folder "./pytorch_models", and the reward records of the evaluations are saved in "./results/[environment name]".

## Results:
![](./exp.png)
* DP: Delayed Policy updates
* TPS: Target Policy Smoothing
* CDQ: Clipped Double Q-learning
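
The three components above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the notebook's PyTorch code: `td3_target` and `should_update_actor` are hypothetical names, and the critics are passed in as plain functions.

```python
import numpy as np


def td3_target(reward, not_done, next_action, q1_target, q2_target,
               max_action, discount=0.99, policy_noise=0.2,
               noise_clip=0.5, rng=None):
    """Compute the critic target for a batch (hypothetical helper).

    next_action is the target actor's action for the next state.
    q1_target and q2_target map an action batch to Q-values of shape (batch, 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    # TPS: add clipped Gaussian noise to the target action, then clip to the action range.
    noise = np.clip(rng.normal(0.0, policy_noise, size=next_action.shape),
                    -noise_clip, noise_clip)
    smoothed = np.clip(next_action + noise, -max_action, max_action)
    # CDQ: use the smaller of the two target critics' estimates.
    target_q = np.minimum(q1_target(smoothed), q2_target(smoothed))
    return reward + not_done * discount * target_q


def should_update_actor(iteration, policy_freq=2):
    """DP: update the actor and target networks only every `policy_freq` iterations."""
    return iteration % policy_freq == 0


# Toy usage with stand-in critics (assumptions for illustration only).
q1 = lambda a: a.sum(axis=1, keepdims=True)
q2 = lambda a: a.sum(axis=1, keepdims=True) - 1.0
reward = np.ones((4, 1))
not_done = np.array([[1.0], [1.0], [0.0], [0.0]])
next_action = np.zeros((4, 2))
y = td3_target(reward, not_done, next_action, q1, q2, max_action=1.0,
               rng=np.random.default_rng(0))
print(y.shape, should_update_actor(4), should_update_actor(5))
```

Disabling one of the three pieces in such a sketch (for example, using only `q1_target`) corresponds to the kind of ablation the legend abbreviations refer to.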