https://github.com/leo27945875/td3-ant-v2

deep-learning mujoco-environments reinforcement-learning td3-pytorch

Host: GitHub
URL: https://github.com/leo27945875/td3-ant-v2
Owner: leo27945875
Created: 2021-08-25T17:50:34.000Z (over 3 years ago)
Default Branch: master
Last Pushed: 2021-09-18T09:03:47.000Z (over 3 years ago)
Last Synced: 2025-01-29T12:48:00.380Z (4 months ago)
Topics: deep-learning, mujoco-environments, reinforcement-learning, td3-pytorch
Language: Jupyter Notebook
Homepage:
Size: 16.8 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.md

# TD3

![](./ant.gif)

## Set the parameters in the "Set the parameters" block of `TD3_Ant.ipynb`:
* env_name = Name of the environment
* seed = Random seed
* start_timesteps = Timestep at which training of the TD3 model starts
* eval_freq = Frequency of evaluation
* max_timesteps = Maximum number of timesteps
* save_models = Whether to save the model
* expl_noise = Exploration noise
* batch_size = Batch size
* discount = Discount factor
* tau = Coefficient for smoothly (softly) updating the target networks, as in the TD3 paper
* policy_noise = Policy noise used for target policy smoothing
* noise_clip = Maximum magnitude of the policy noise
* policy_freq = Frequency of updates to the actor and target networks
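
As a rough illustration, the parameter block might look like the sketch below. The values are assumptions based on commonly used TD3 defaults and are not confirmed to match what `TD3_Ant.ipynb` ships with. `env_name` is guessed from the repository name, and `check_params` is a hypothetical helper that is not part of the notebook.

```python
# Illustrative values only: commonly used TD3 defaults, NOT confirmed to be
# the settings in TD3_Ant.ipynb.
env_name = "Ant-v2"          # assumed from the repository name
seed = 0                     # random seed
start_timesteps = 10_000     # timestep at which TD3 training starts
eval_freq = 5_000            # evaluate every eval_freq timesteps
max_timesteps = 1_000_000    # total number of timesteps
save_models = True           # whether to save the model
expl_noise = 0.1             # exploration noise
batch_size = 100             # batch size
discount = 0.99              # discount factor
tau = 0.005                  # soft-update coefficient for the target networks
policy_noise = 0.2           # noise for target policy smoothing
noise_clip = 0.5             # maximum magnitude of the policy noise
policy_freq = 2              # update actor/targets every policy_freq steps


def check_params():
    """Basic sanity checks (hypothetical helper, not part of the notebook)."""
    assert 0 <= start_timesteps < max_timesteps
    assert eval_freq > 0 and batch_size > 0 and policy_freq >= 1
    assert 0.0 < discount <= 1.0
    assert 0.0 < tau <= 1.0
    assert expl_noise >= 0.0 and policy_noise >= 0.0 and noise_clip >= 0.0
    return True
```

Calling `check_params()` before "Run All" is one way to catch an inconsistent setting early, for example a `start_timesteps` larger than `max_timesteps`.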

## Model Training:

After setting all the parameters, click "Run All" to train the TD3 model and evaluate it. The model file is saved in the folder "./pytorch_models", and the reward records of the evaluations are saved in "./results/[environment name]".

## Results:
![](./exp.png)
* DP: Delayed Policy updates
* TPS: Target Policy Smoothing
* CDQ: Clipped Double Q-learning
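
For readers unfamiliar with the three components above, here is a minimal, framework-free sketch of how each one uses the parameters `policy_noise`, `noise_clip`, `discount`, `tau`, and `policy_freq`. It works on plain Python numbers instead of PyTorch tensors, and all function names are hypothetical. It is not the code in `TD3_Ant.ipynb`.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_action(target_action, policy_noise, noise_clip, max_action, rng):
    """TPS: add clipped Gaussian noise to each component of the target action."""
    out = []
    for a in target_action:
        noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
        out.append(clip(a + noise, -max_action, max_action))
    return out


def td_target(reward, done, q1_next, q2_next, discount):
    """CDQ: bootstrap from the smaller of the two target-critic estimates."""
    return reward + (1.0 - done) * discount * min(q1_next, q2_next)


def soft_update(target_params, params, tau):
    """Soft update: target <- tau * online + (1 - tau) * target."""
    return [tau * p + (1.0 - tau) * t for p, t in zip(params, target_params)]


def should_update_actor(iteration, policy_freq):
    """DP: update the actor and target networks only every policy_freq iterations."""
    return iteration % policy_freq == 0
```

In a training loop, the critics would be updated on every iteration using `td_target`, while the actor update and `soft_update` would run only when `should_update_actor` returns `True`.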
