https://github.com/qasimwani/td-learning-openai
Teaching an RL agent to perform a task using Temporal Difference control algorithms.
- Host: GitHub
- URL: https://github.com/qasimwani/td-learning-openai
- Owner: QasimWani
- Created: 2020-07-19T15:33:09.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-07-24T02:15:32.000Z (about 5 years ago)
- Last Synced: 2025-05-18T11:06:39.804Z (5 months ago)
- Language: Jupyter Notebook
- Size: 209 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Temporal Difference Learning (RL)
Teaching an RL agent to perform a task using Temporal Difference control algorithms. Implemented Sarsa, Q-learning (Sarsamax), and Expected Sarsa from scratch to teach an agent in OpenAI Gym's Taxi-v3 environment to complete each episode with the maximum expected cumulative reward by estimating the optimal policy π.
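The three control algorithms named above differ only in how they form the bootstrapped TD target for the next state. The following is a minimal sketch of that difference, not the notebook's actual code: the function names (`epsilon_greedy_probs`, `td_target`, `td_update`), the Q-table layout (a dict mapping each state to a list of action values), and the use of an ε-greedy policy for Expected Sarsa are all assumptions made for illustration.

```python
def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy over one Q-table row."""
    n = len(q_row)
    probs = [epsilon / n] * n                      # exploration mass, spread evenly
    best = max(range(n), key=lambda a: q_row[a])   # greedy action (first on ties)
    probs[best] += 1.0 - epsilon                   # remaining mass goes to greedy action
    return probs


def td_target(method, Q, reward, next_state, next_action, gamma, epsilon, done):
    """Return the TD target for one transition under the chosen method.

    Hypothetical helper: `method` is one of "sarsa", "q_learning",
    "expected_sarsa". `next_action` is only used by Sarsa.
    """
    if done:
        return reward                              # no bootstrapping past a terminal state
    row = Q[next_state]
    if method == "sarsa":
        bootstrap = row[next_action]               # value of the action actually taken next
    elif method == "q_learning":
        bootstrap = max(row)                       # value of the greedy next action
    elif method == "expected_sarsa":
        probs = epsilon_greedy_probs(row, epsilon)
        bootstrap = sum(p * q for p, q in zip(probs, row))  # expectation under the policy
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


def td_update(Q, state, action, target, alpha):
    """Move Q[state][action] a fraction alpha of the way toward the target."""
    Q[state][action] += alpha * (target - Q[state][action])


# Toy Q-table with two states and two actions (made-up numbers).
Q = {0: [0.0, 0.0], 1: [1.0, 3.0]}
for method in ("sarsa", "q_learning", "expected_sarsa"):
    target = td_target(method, Q, reward=1.0, next_state=1, next_action=0,
                       gamma=0.5, epsilon=0.2, done=False)
    print(method, target)
```

Sarsa is on-policy because its target depends on the action the agent actually takes next, while Q-learning always bootstraps from the greedy action. Expected Sarsa averages over the policy's action probabilities, which removes the sampling noise of the next action from the target.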
Gridworld:
`+---------+`
`|R: | : :G|`
`| : | : : |`
`| : : : : |`
`| | : | : |`
`|Y| : |B: |`
`+---------+`
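To make the map above concrete, here is a small sketch that reads the four lettered cells out of the ASCII grid and packs a full state into a single integer. The helper names (`landmark_cells`, `encode`, `decode`) are hypothetical, and the packing order (taxi row, taxi column, passenger index, destination index, giving 5 × 5 × 5 × 4 = 500 states) is an assumption based on how Gym's Taxi environment is commonly described, so check it against the environment source before relying on it.

```python
# The map from the README, one string per row.
TAXI_MAP = [
    "+---------+",
    "|R: | : :G|",
    "| : | : : |",
    "| : : : : |",
    "| | : | : |",
    "|Y| : |B: |",
    "+---------+",
]


def landmark_cells(grid):
    """Return {letter: (row, col)} for the lettered cells in the map."""
    cells = {}
    for r, line in enumerate(grid[1:-1]):          # skip top and bottom borders
        for i, ch in enumerate(line):
            if ch.isalpha():
                cells[ch] = (r, (i - 1) // 2)      # cells sit at odd string indices
    return cells


def encode(taxi_row, taxi_col, passenger_idx, dest_idx):
    """Pack the four state components into one integer in [0, 500).

    Assumed convention: passenger_idx is 0..3 for a landmark and 4 for
    "in the taxi"; dest_idx is 0..3 for a landmark.
    """
    return ((taxi_row * 5 + taxi_col) * 5 + passenger_idx) * 4 + dest_idx


def decode(state):
    """Inverse of encode: unpack an integer state into its four components."""
    state, dest_idx = divmod(state, 4)
    state, passenger_idx = divmod(state, 5)
    taxi_row, taxi_col = divmod(state, 5)
    return taxi_row, taxi_col, passenger_idx, dest_idx


print(landmark_cells(TAXI_MAP))
print(decode(encode(2, 3, 1, 0)))
```

A single integer state is convenient for tabular TD methods because it can index a Q-table directly, with one row per state and one column per action.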