Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/codeamt/maze_rl
a multi-agent cooperative maze game using gym.
- Host: GitHub
- URL: https://github.com/codeamt/maze_rl
- Owner: codeamt
- Created: 2021-01-09T00:58:33.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2021-01-09T06:04:21.000Z (almost 4 years ago)
- Last Synced: 2023-10-20T04:52:17.603Z (about 1 year ago)
- Language: Python
- Size: 149 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# maze_rl
A multi-agent cooperative maze game using Open AI's gym.
## About
This reinforcement learning (RL) game uses Proximal Policy Optimization (PPO2) to train two agents to cooperate in finding their way out of an n×m (5×5) grid world (the "Maze").

**Rules:**
- Agents must stay within the bounds of the n×m Maze
- Agents cannot occupy the same position simultaneously

**Entities:**
- 0: Available Space
- 1: Agent 1
- 2: Agent 2
- 3: A Trap (Instant Loss)
- 4: A Teleportation Portal (Moves partner 3 steps North)
- 5: An Exit (A win)

**Action Space (Discrete):**
- 0: Up
- 1: Right
- 2: Down
- 3: Left

**Observation Space:**
- (np.array) -> the next player and the maze world, flattened and concatenated along the horizontal axis.

**State:**
- P: In Play
- L: Agents Lost
- W: Agents Won

**Reward Function/Incentive Mechanism:**
- Reward Range: (-200, 200)
- Incentives: +1 for Exploring
- Penalties: -2 for moving to a previously visited cell

## Getting started
Clone the repo:
```
git clone https://github.com/codeamt/maze_rl.git
```

Install dependencies:
```
cd maze_rl && pip install -r requirements.txt
```

## Running Episodes (Local)
Change into the code directory and run the script:
```
cd code && python main.py
```
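The rules and rewards described above can be sketched as a single step function. This is a hypothetical illustration, not the repository's actual code: the `step` function name, the -2 out-of-bounds penalty, and the ±200 terminal rewards are assumptions (the README only gives the reward range and the +1 explore / -2 revisit incentives); portal handling is omitted for brevity.

```python
import numpy as np

# Entity codes from the README
EMPTY, AGENT1, AGENT2, TRAP, PORTAL, EXIT = range(6)
# Action codes: 0 Up, 1 Right, 2 Down, 3 Left
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

def step(world, pos, visited, action):
    """Move one agent; return (new_pos, reward, state).

    state is 'P' (in play), 'L' (lost), or 'W' (won),
    mirroring the README's state codes.
    (Portal handling omitted for brevity.)
    """
    n, m = world.shape
    r, c = pos[0] + MOVES[action][0], pos[1] + MOVES[action][1]
    if not (0 <= r < n and 0 <= c < m):
        return pos, -2, "P"           # out of bounds: stay put, penalize (assumed)
    cell = world[r, c]
    if cell == TRAP:
        return (r, c), -200, "L"      # instant loss
    if cell == EXIT:
        return (r, c), 200, "W"       # win
    reward = -2 if (r, c) in visited else 1  # -2 revisit, +1 explore
    visited.add((r, c))
    return (r, c), reward, "P"

# The default --world layout
world = np.array([[1, 0, 2, 0, 0],
                  [0, 0, 0, 0, 0],
                  [4, 0, 0, 0, 0],
                  [0, 0, 3, 3, 0],
                  [0, 0, 3, 5, 0]])
pos, reward, state = step(world, (0, 0), {(0, 0)}, 1)  # Agent 1 moves Right
```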
### Args:

- **--epochs** (type: int, default: 500000)
  Number of epochs
- **--lr** (type: float, default: 0.001)
  Learning rate for the training policy
- **--gamma** (type: float, default: 0.000001)
  Discount factor
- **--lam** (type: float, default: 0)
  Lambda/GAE factor
- **--world** (type: List[List[int]], default:
  [[1, 0, 2, 0, 0],
   [0, 0, 0, 0, 0],
   [4, 0, 0, 0, 0],
   [0, 0, 3, 3, 0],
   [0, 0, 3, 5, 0]])
- **--inference** (type: bool, default: True)
  Whether or not to run a quick inference test after training to evaluate performance.

After some warnings from TensorFlow, you will see the updated maze after each step in the output.
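As a rough illustration of the observation space described in the About section (a sketch under assumptions, not the repository's exact code — the `observation` function name is illustrative), the next player's id and the flattened maze can be concatenated into a single vector:

```python
import numpy as np

# The default --world layout
world = np.array([[1, 0, 2, 0, 0],
                  [0, 0, 0, 0, 0],
                  [4, 0, 0, 0, 0],
                  [0, 0, 3, 3, 0],
                  [0, 0, 3, 5, 0]])

def observation(next_player, world):
    # Concatenate the next player's id with the flattened maze
    # along the horizontal axis, as the README describes.
    return np.concatenate(([next_player], world.flatten()))

obs = observation(1, world)  # length 1 + 5*5 = 26; obs[0] is the next player's id
```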
Check the render folder for a training report.

## Docker Build/Run Instructions
Build the image:
```
docker build -t maze_rl .
```
Run the container:
```
docker run -d -it --name maze_rl maze_rl python main.py --epochs=500000
```