https://github.com/mauroluzzatto/q-learning-demo-play-nchain
This repository contains a Jupyter Notebook with an implementation of a Q-Learning agent, which learns to solve the n-Chain OpenAI Gym environment.
- Host: GitHub
- URL: https://github.com/mauroluzzatto/q-learning-demo-play-nchain
- Owner: MauroLuzzatto
- Created: 2019-09-16T18:51:32.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-10-08T18:26:28.000Z (over 1 year ago)
- Last Synced: 2024-06-14T02:26:49.681Z (7 months ago)
- Topics: demo, gym, jupyter-notebook, openai-gym, python, q-learning, q-learning-algorithm, reinforcement-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 2.17 MB
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Q-Learning Notebook - Play the N-Chain Environment with three Agents
This repository contains a Jupyter Notebook with an implementation of a `Q-Learning` agent, which learns to solve the n-Chain `OpenAI Gym` environment.

This notebook is inspired by the following notebook:
[Deep Reinforcement Learning Course Notebook](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Q%20learning/Taxi-v2/Q%20Learning%20with%20OpenAI%20Taxi-v2%20video%20version.ipynb)

## Q-Learning
The notebook contains a `Q-Learning` algorithm implementation and a training loop to solve the n-Chain OpenAI Gym environment. The `Q-Learning` algorithm is an off-policy temporal-difference control algorithm [1]:
[Image](http://incompleteideas.net/book/the-book-2nd.html) taken from **Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, Second edition, 2014/2015, page 158**
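
The update step at the core of the algorithm can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's own code: the function name `q_learning_update`, the list-of-lists Q-table, and the default values for `alpha` (learning rate) and `gamma` (discount factor) are all assumptions made for this example.

```python
def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the TD error.

    alpha and gamma are illustrative defaults, not values from the notebook.
    """
    # Off-policy target: bootstrap from the best action in the next state,
    # regardless of which action the agent will actually take there.
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return td_error


# Hypothetical usage: 5 states, 2 actions, all Q-values start at zero.
q_table = [[0.0, 0.0] for _ in range(5)]
td_error = q_learning_update(q_table, state=4, action=0,
                             reward=10.0, next_state=4)
print(q_table[4][0])
```

Taking the maximum over the next state's Q-values, rather than the value of the action actually chosen next, is what makes the update off-policy.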
## The Q-Learning Agents
In this notebook we let different Q-learning agents play the N-Chain environment and see how they perform in the game. The following agents are implemented:
- 🤓 Smart Agent 1: the agent explores and takes future rewards into account
- 🤑 Greedy Agent 2: the agent cares only about immediate rewards (small gamma)
- 😳 Shy Agent 3: the agent doesn't explore the environment (small epsilon)

## The n-Chain Environment
The n-Chain environment is taken from the `OpenAI Gym` module. Documentation:
[n-Chain environment](https://gym.openai.com/envs/NChain-v0/)
The image below shows an example of a 5-Chain (n = 5) environment with 5 states. `a` stands for action and `r` for the reward ([Image Source](https://adventuresinmachinelearning.com/reinforcement-learning-tutorial-python-keras/)).
### States
This environment contains a chain with n positions, and every chain position corresponds to a possible state the agent can be in:
| state | description|
|--- |--- |
| n (default n=5) | n-th position on the chain |

### Actions and Rewards
The agent can move along the chain using two actions, each of which yields a different reward:
|action | reward | description |
|---|---|---|
| 0 | get no reward | move forward along the chain (next state = current state + 1) |
| 1 | get a small reward of 2 | jump back to state 0 |

**The end of the chain presents a large reward of 10, and while standing at the end of the chain and still moving forward (action 0), the large reward can be gained repeatedly**.
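
The rules above, together with the three agents described earlier, can be sketched without any dependency on Gym. Everything in this block is hypothetical: `ChainSketch` is a deterministic stand-in written only from the tables above (it ignores any randomness the real Gym environment may add), and the `gamma`, `epsilon`, `alpha`, and step-count values are illustrative guesses rather than the notebook's settings.

```python
import random


class ChainSketch:
    """Hypothetical deterministic stand-in for the n-Chain rules described above."""

    def __init__(self, n=5, small=2, large=10):
        self.n, self.small, self.large = n, small, large
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action == 1:                  # jump back to state 0, small reward
            self.state = 0
            return self.state, self.small
        if self.state < self.n - 1:      # move one position forward, no reward
            self.state += 1
            return self.state, 0
        return self.state, self.large    # already at the end: large reward again


# Illustrative hyperparameters only; the notebook's actual values may differ.
AGENTS = {
    "smart": {"gamma": 0.95, "epsilon": 0.3},    # explores, values the future
    "greedy": {"gamma": 0.05, "epsilon": 0.3},   # small gamma
    "shy": {"gamma": 0.95, "epsilon": 0.01},     # small epsilon
}


def train(gamma, epsilon, steps=2000, alpha=0.1, seed=0):
    """Run epsilon-greedy tabular Q-learning; return (q_table, total_reward)."""
    rng = random.Random(seed)
    env = ChainSketch()
    q_table = [[0.0, 0.0] for _ in range(env.n)]
    state, total = env.reset(), 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)    # explore: pick a random action
        else:
            # exploit: pick the best known action (ties go to action 0)
            action = max((0, 1), key=lambda a: q_table[state][a])
        next_state, reward = env.step(action)
        target = reward + gamma * max(q_table[next_state])
        q_table[state][action] += alpha * (target - q_table[state][action])
        state, total = next_state, total + reward
    return q_table, total


for name, params in AGENTS.items():
    _, total = train(**params)
    print(f"{name}: total reward {total}")
```

The totals printed depend on the seed, the tie-breaking rule, and the chosen hyperparameters, so they should be read as a way to experiment with the three agent profiles rather than as a reproduction of the notebook's results.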
## Additional Resources About Reinforcement Learning
- [OpenAI Gym](https://gym.openai.com/): Gym is a toolkit for developing and comparing reinforcement learning algorithms from OpenAI
- [OpenAI Baselines](https://github.com/openai/baselines): OpenAI Baselines is a set of high-quality implementations of reinforcement learning algorithms
- [Spinning Up AI](https://spinningup.openai.com): This is an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning
- [A Long Peek into Reinforcement Learning](https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html): Great blog post from Lilian Weng, in which she briefly covers the field of Reinforcement Learning (RL), from fundamental concepts to classic algorithms
- [Policy Gradient Algorithms](https://lilianweng.github.io/lil-log/2018/04/08/policy-gradient-algorithms.html): Another great blog post from Lilian Weng, where she writes about policy gradient algorithms