https://github.com/rvk007/multi-env-decision-making
Making a single agent navigate through traffic across different environments.
https://github.com/rvk007/multi-env-decision-making
decision-making deep-q-network deep-reinforcement-learning multi-task-learning reinforcement-learning
Last synced: 7 months ago
JSON representation
Making a single agent navigate through traffic across different environments.
- Host: GitHub
- URL: https://github.com/rvk007/multi-env-decision-making
- Owner: rvk007
- License: apache-2.0
- Created: 2021-11-17T20:35:01.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-12-18T06:36:05.000Z (about 4 years ago)
- Last Synced: 2025-02-26T04:26:08.858Z (11 months ago)
- Topics: decision-making, deep-q-network, deep-reinforcement-learning, multi-task-learning, reinforcement-learning
- Language: Python
- Homepage: https://shantanuacharya.notion.site/shantanuacharya/Multi-Env-Decision-Making-d40e0ad783e64eebbb755756306e8ed9
- Size: 495 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Multi-Env Decision Making
[](https://shantanuacharya.notion.site/Multi-Env-Decision-Making-d40e0ad783e64eebbb755756306e8ed9)
One of the current major areas of research in Reinforcement Learning is trying to make policies that are generalizable. Currently, in most of the tasks, a policy trained to perform well in an environment starts to struggle when even deployed in a slightly different environment.
We aim to tackle this issue by creating a **_multi-environment_ decision-making policy** that can perform well in different environment settings. We try to achieve this goal by using **Deep Q Network (DQN)** and **modifying its policy network**.
## Environment
We use [Elurent’s Highway-env](https://github.com/eleurent/highway-env) project which offers a collection of environments for autonomous driving and tactical decision-making tasks. From this project, we used three environments named Highway, Merge, and Roundabout for our experiments. The goal of the agent (a.k.a the ego vehicle) is to drive at a high speed without colliding with neighboring vehicles.
Highway
Merge
Roundabout
## Policy
We chose **Deep Q Network** (DQN) to solve this task. The policy network is modified to use an encoder-decoder based Multi-Layer Feed Forward Policy Network (MlpPolicy). The encoder is shared by all the environments and hence learns features that are common to all environments. On the other hand, each environment has a separate decoder that learns the environment-specific features.
To train the multi-env policy, the observations are rolled out randomly from all three environments. This enables the policy to learn all environments at once and ensure the presence of a variety of observations in the experience replay buffer.
## Setup Instructions
To run the scripts, first install the necessary packages by running the command
`$ pip install -r requirements.txt`
## Usage
To train the policy
`python run.py --config `
To evaluate a policy (without rendering videos in real-time)
`python run.py --config -m test -p `
To evaluate a policy (with video rendering in real-time)
`python run.py --config -m test -p --render_video`
## Contact/Getting Help
To know more about the experiments we conducted and their results, please refer to the [blog](https://shantanuacharya.notion.site/Multi-Env-Decision-Making-d40e0ad783e64eebbb755756306e8ed9). If you still have questions, feel free to raise an issue.