https://github.com/wangshusen/DRL

Deep Reinforcement Learning

Host: GitHub
URL: https://github.com/wangshusen/DRL
Owner: wangshusen
License: other
Created: 2020-07-19T16:48:21.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2022-12-10T13:25:34.000Z (over 2 years ago)
Last Synced: 2024-11-17T12:21:47.537Z (6 months ago)
Size: 192 MB
Stars: 3,350
Watchers: 42
Forks: 589
Open Issues: 41
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

StarryDivineSky - wangshusen/DRL

README

# Deep Reinforcement Learning

1. **Overview.**

    * Reinforcement Learning
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_1.pdf)]
    [[lecture note](https://github.com/wangshusen/DeepLearning/blob/master/LectureNotes/DRL/DRL.pdf)]
    [[Video (in Chinese)](https://youtu.be/vmkRMvhCW5c)].

    * Value-Based Learning
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_2.pdf)]
    [[Video (in Chinese)](https://youtu.be/jflq6vNcZyA)].

    * Policy-Based Learning
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_3.pdf)]
    [[Video (in Chinese)](https://youtu.be/qI0vyfR2_Rc)].

    * Actor-Critic Methods
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_4.pdf)]
    [[Video (in Chinese)](https://youtu.be/xjd7Jq9wPQY)].

    * AlphaGo
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_5.pdf)]
    [[Video (in Chinese)](https://youtu.be/zHojAp5vkRE)].

2. **TD Learning.**

    * Sarsa
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_1.pdf)]
    [[Video (in Chinese)](https://youtu.be/-cYWdUubB6Q)].

    * Q-learning
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_2.pdf)]
    [[Video (in Chinese)](https://youtu.be/Ymy2w3DGn2U)].

    * Multi-Step TD Target
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_3.pdf)]
    [[Video (in Chinese)](https://youtu.be/UqTP138IATc)].

3. **Advanced Topics on Value-Based Learning.**

    * Experience Replay (ER) & Prioritized ER
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_1.pdf)]
    [[Video (in Chinese)](https://youtu.be/rhslMPmj7SY)].

    * Overestimation, Target Network, & Double DQN
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_2.pdf)]
    [[Video (in Chinese)](https://youtu.be/X2-56QN79zc)].

    * Dueling Networks
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_3.pdf)]
    [[Video (in Chinese)](https://youtu.be/DBux6cA0EoM)].

4. **Policy Gradient with Baseline.**

    * Policy Gradient with Baseline
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_1.pdf)]
    [[Video (in Chinese)](https://youtu.be/yNEqbptitZs)].

    * REINFORCE with Baseline
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_2.pdf)]
    [[Video (in Chinese)](https://youtu.be/Ob78ADXTQNo)].

    * Advantage Actor-Critic (A2C)
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_3.pdf)]
    [[Video (in Chinese)](https://youtu.be/mtT4TSGSon8)].

    * REINFORCE versus A2C
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_4.pdf)]
    [[Video (in Chinese)](https://youtu.be/hN9WMIMMeAI)].

5. **Advanced Topics on Policy-Based Learning.**

    * Trust-Region Policy Optimization (TRPO)
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/5_Policy_1.pdf)]
    [[Video (in Chinese)](https://youtu.be/fcSYiyvPjm4)].

    * Partial Observation and RNNs.

6. **Dealing with Continuous Action Space.**

    * Discrete versus Continuous Control
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_1.pdf)]
    [[Video (in Chinese)](https://youtu.be/rRIjgdxSvg8)].

    * Deterministic Policy Gradient (DPG) for Continuous Control
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_2.pdf)]
    [[Video (in Chinese)](https://youtu.be/cmWejKRWLA8)].

    * Stochastic Policy Gradient for Continuous Control
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_3.pdf)]
    [[Video (in Chinese)](https://youtu.be/McqFyl_W5Wc)].

7. **Multi-Agent Reinforcement Learning.**

    * Basics and Challenges
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/7_MARL_1.pdf)]
    [[Video (in Chinese)](https://youtu.be/KN-XMQFTD0o)].

    * Centralized versus Decentralized
    [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/7_MARL_2.pdf)]
    [[Video (in Chinese)](https://youtu.be/0HV1hsjd1y8)].

8. **Imitation Learning.**

    * Inverse Reinforcement Learning.

    * Generative Adversarial Imitation Learning (GAIL).
