Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/djbyrne/maddpg
Final project for the Udacity RL nano degree implementing Multi Agent Deep Deterministic Policy Gradients
https://github.com/djbyrne/maddpg
Last synced: 12 days ago
JSON representation
Final project for the Udacity RL nano degree implementing Multi Agent Deep Deterministic Policy Gradients
- Host: GitHub
- URL: https://github.com/djbyrne/maddpg
- Owner: djbyrne
- Created: 2019-02-11T07:09:27.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2023-03-24T23:40:37.000Z (almost 2 years ago)
- Last Synced: 2023-08-01T12:16:57.831Z (over 1 year ago)
- Language: ASP
- Homepage:
- Size: 26 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# MADDPG
### Introduction
This experiment implements the Multi Agent Deep Deterministic Policy Gradient algorithm to train a two independent agents to learn how to keep passing the ball back and fort (rallying) inside the unity ML-Agents virtual [Tennis](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Examples.md#tennis) environment. In this environment, two paddles move within their side of the tennis court in an attempt to hit the ball back to the opposite agent.
![Trained Agent](/images/trained_maddpg.gif)
### Rewards
+0.1 To agent when hitting ball over net.
-0.1 To agent who let ball hit their ground, or hit ball out of bounds.### State Space
The observation space consists of 8 variables corresponding to position and velocity of the ball and agents racket
### Action Space
Continuous) Size of 2, corresponding to movement toward net or away from net, and jumping.
### Solving the Environment
The environment is considered solved when the average score of all agents in the environment (in this case 2) for a period of 100 episodes is 0.5 or above, with a max score of 2.5 .### Setup
1. Create (and activate) a new environment with Python 3.6.
- __Linux__ or __Mac__:
```bash
conda create --name ddpg python=3.6
source activate ddpg
```
- __Windows__:
```bash
conda create --name ddpg python=3.6
activate ddpg
```2. Clone the repository and install dependencies.
```bash
git clone https://github.com/djbyrne/MADDPG.git
cd MADDPG
pip install .
```
3. Download the environment from one of the links below. You need only select the environment that matches your operating system:- Linux: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher_Linux.zip)
- Mac OSX: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher.app.zip)
- Windows (32-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher_Windows_x86.zip)
- Windows (64-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher_Windows_x86_64.zip)
4. Finally, run the setup.py file to install all dependencies for this project### Instructions
All code for this project is contained in the MADDPG.ipynb notebook. As such you just need to run the cells in order to see the results.