[![DOI](https://zenodo.org/badge/161216111.svg)](https://zenodo.org/doi/10.5281/zenodo.10869789)
[![tests](https://github.com/LucasAlegre/sumo-rl/actions/workflows/linux-test.yml/badge.svg)](https://github.com/LucasAlegre/sumo-rl/actions/workflows/linux-test.yml)
[![PyPI version](https://badge.fury.io/py/sumo-rl.svg)](https://badge.fury.io/py/sumo-rl)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://pre-commit.com/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![License](http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat)](https://github.com/LucasAlegre/sumo-rl/blob/main/LICENSE)

# SUMO-RL

SUMO-RL provides a simple interface to instantiate Reinforcement Learning (RL) environments with [SUMO](https://github.com/eclipse/sumo) for Traffic Signal Control.

Goals of this repository:
- Provide a simple interface to work with Reinforcement Learning for Traffic Signal Control using SUMO
- Support Multiagent RL
- Compatibility with gymnasium.Env and popular RL libraries such as [stable-baselines3](https://github.com/DLR-RM/stable-baselines3) and [RLlib](https://docs.ray.io/en/main/rllib.html)
- Easy customisation: state and reward definitions are easily modifiable

The main class is [SumoEnvironment](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py).
If instantiated with the parameter `single_agent=True`, it behaves like a regular [Gymnasium Env](https://github.com/Farama-Foundation/Gymnasium).
For multiagent environments, use [env](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py) or [parallel_env](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py) to instantiate a [PettingZoo](https://github.com/PettingZoo-Team/PettingZoo) environment with the AEC or Parallel API, respectively.
[TrafficSignal](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/traffic_signal.py) is responsible for retrieving information from and actuating the traffic lights via the [TraCI](https://sumo.dlr.de/wiki/TraCI) API.
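
As a quick orientation, here is a minimal sketch of these three entry points (the file paths are placeholders and constructor defaults may differ between versions):

```python
import sumo_rl
from sumo_rl import SumoEnvironment

# Single-agent: behaves like a regular gymnasium.Env
env = SumoEnvironment(
    net_file="path_to_your_network.net.xml",
    route_file="path_to_your_routefile.rou.xml",
    single_agent=True,
    num_seconds=10000,
)

# Multiagent: PettingZoo environments with the AEC or Parallel API
aec_env = sumo_rl.env(net_file="path_to_your_network.net.xml",
                      route_file="path_to_your_routefile.rou.xml")
parallel_env = sumo_rl.parallel_env(net_file="path_to_your_network.net.xml",
                                    route_file="path_to_your_routefile.rou.xml")
```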

For more details, check the [documentation online](https://lucasalegre.github.io/sumo-rl/).

## Install

### Install the latest version of SUMO:

```bash
sudo add-apt-repository ppa:sumo/stable
sudo apt-get update
sudo apt-get install sumo sumo-tools sumo-doc
```
Don't forget to set the SUMO_HOME environment variable (the default SUMO installation path is /usr/share/sumo):
```bash
echo 'export SUMO_HOME="/usr/share/sumo"' >> ~/.bashrc
source ~/.bashrc
```
Important: for a large performance boost (~8x), you can run the simulation with Libsumo instead of TraCI by setting the following environment variable:
```bash
export LIBSUMO_AS_TRACI=1
```
Note that you will not be able to use sumo-gui or run multiple simulations in parallel if this is active ([more details](https://sumo.dlr.de/docs/Libsumo.html)).

### Install SUMO-RL

The stable release version is available through pip:
```bash
pip install sumo-rl
```

Alternatively, you can install the latest (unreleased) version from source:
```bash
git clone https://github.com/LucasAlegre/sumo-rl
cd sumo-rl
pip install -e .
```

## MDP - Observations, Actions and Rewards

### Observation

The default observation for each traffic signal agent is a vector:
```python
obs = [phase_one_hot, min_green, lane_1_density,...,lane_n_density, lane_1_queue,...,lane_n_queue]
```
- ```phase_one_hot``` is a one-hot encoded vector indicating the current active green phase
- ```min_green``` is a binary variable indicating whether min_green seconds have already passed in the current phase
- ```lane_i_density``` is the number of vehicles in incoming lane i divided by the total capacity of the lane
- ```lane_i_queue``` is the number of queued (speed below 0.1 m/s) vehicles in incoming lane i divided by the total capacity of the lane

You can define your own observation by implementing a class that inherits from [ObservationFunction](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/observations.py) and passing it to the environment constructor.
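
For example, below is a minimal sketch of a custom observation that keeps only the phase one-hot encoding and the per-lane queues. The `observation_class` constructor parameter and the `TrafficSignal` attributes used here (`green_phase`, `num_green_phases`, `lanes`, `get_lanes_queue`) reflect the repository source at the time of writing and may differ between versions:

```python
import numpy as np
from gymnasium import spaces

from sumo_rl import SumoEnvironment
from sumo_rl.environment.observations import ObservationFunction
from sumo_rl.environment.traffic_signal import TrafficSignal


class QueueObservationFunction(ObservationFunction):
    """Custom observation: current green phase (one-hot) + normalized queue per incoming lane."""

    def __init__(self, ts: TrafficSignal):
        super().__init__(ts)

    def __call__(self) -> np.ndarray:
        phase_one_hot = [1 if self.ts.green_phase == i else 0 for i in range(self.ts.num_green_phases)]
        queue = self.ts.get_lanes_queue()  # queued vehicles per incoming lane, normalized by lane capacity
        return np.array(phase_one_hot + queue, dtype=np.float32)

    def observation_space(self) -> spaces.Box:
        size = self.ts.num_green_phases + len(self.ts.lanes)
        return spaces.Box(low=np.zeros(size, dtype=np.float32), high=np.ones(size, dtype=np.float32))


env = SumoEnvironment(..., observation_class=QueueObservationFunction)
```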

### Action

The action space is discrete.
Every ```delta_time``` seconds, each traffic signal agent can choose the next green phase configuration.

E.g.: In the [2-way single intersection](https://github.com/LucasAlegre/sumo-rl/blob/main/experiments/dqn_2way-single-intersection.py) there are |A| = 4 discrete actions, corresponding to the following green phase configurations:



Important: every time a phase change occurs, the next phase is preceded by a yellow phase lasting ```yellow_time``` seconds.
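
The timing behaviour described in this section is controlled by constructor parameters of `SumoEnvironment`. A hedged sketch with illustrative values (parameter names taken from env.py; defaults may differ between versions):

```python
from sumo_rl import SumoEnvironment

env = SumoEnvironment(
    net_file="path_to_your_network.net.xml",
    route_file="path_to_your_routefile.rou.xml",
    delta_time=5,    # seconds between consecutive agent actions
    yellow_time=2,   # duration of the yellow phase inserted on every phase change
    min_green=5,     # minimum green time before a phase can be switched
    max_green=50,    # maximum green time for a single phase
)
```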

### Rewards

The default reward function is the change in cumulative vehicle delay:

`r_t = D_{t-1} - D_t`

where `D_t` is the total delay at time-step `t`, i.e., the sum of the waiting times of all vehicles approaching the intersection.
That is, the reward is how much the total delay decreased in relation to the previous time-step.

You can choose a different reward function (see the ones implemented in [TrafficSignal](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/traffic_signal.py)) with the parameter `reward_fn` in the [SumoEnvironment](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py) constructor.

It is also possible to implement your own reward function:

```python
def my_reward_fn(traffic_signal):
    return traffic_signal.get_average_speed()

env = SumoEnvironment(..., reward_fn=my_reward_fn)
```

## APIs (Gymnasium and PettingZoo)

### Gymnasium Single-Agent API

If your network only has ONE traffic light, then you can instantiate a standard Gymnasium env (see [Gymnasium API](https://gymnasium.farama.org/api/env/)):
```python
import gymnasium as gym
import sumo_rl
env = gym.make('sumo-rl-v0',
               net_file='path_to_your_network.net.xml',
               route_file='path_to_your_routefile.rou.xml',
               out_csv_name='path_to_output.csv',
               use_gui=True,
               num_seconds=100000)
obs, info = env.reset()
done = False
while not done:
    next_obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated
```

### PettingZoo Multi-Agent API

For multi-agent environments, you can use the PettingZoo API (see [Petting Zoo API](https://pettingzoo.farama.org/api/parallel/)):

```python
import sumo_rl
env = sumo_rl.parallel_env(net_file='nets/RESCO/grid4x4/grid4x4.net.xml',
                           route_file='nets/RESCO/grid4x4/grid4x4_1.rou.xml',
                           use_gui=True,
                           num_seconds=3600)
observations = env.reset()
while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}  # this is where you would insert your policy
    observations, rewards, terminations, truncations, infos = env.step(actions)
```

### RESCO Benchmarks

In the folder [nets/RESCO](https://github.com/LucasAlegre/sumo-rl/tree/main/nets/RESCO) you can find the network and route files from [RESCO](https://github.com/jault/RESCO) (Reinforcement Learning Benchmarks for Traffic Signal Control), which was built on top of SUMO-RL. See their [paper](https://people.engr.tamu.edu/guni/Papers/NeurIPS-signals.pdf) for results.



## Experiments

Check [experiments](https://github.com/LucasAlegre/sumo-rl/tree/main/experiments) for examples on how to instantiate an environment and train your RL agent.

### [Q-learning](https://github.com/LucasAlegre/sumo-rl/blob/main/agents/ql_agent.py) in a one-way single intersection:
```bash
python experiments/ql_single-intersection.py
```

### [RLlib PPO](https://docs.ray.io/en/latest/_modules/ray/rllib/algorithms/ppo/ppo.html) multiagent in a 4x4 grid:
```bash
python experiments/ppo_4x4grid.py
```

### [stable-baselines3 DQN](https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/dqn/dqn.py) in a 2-way single intersection:
Note: you need to install stable-baselines3 with ```pip install "stable_baselines3[extra]>=2.0.0a9"``` for [Gymnasium compatibility](https://stable-baselines3.readthedocs.io/en/master/guide/install.html).
```bash
python experiments/dqn_2way-single-intersection.py
```

### Plotting results:
```bash
python outputs/plot.py -f outputs/4x4grid/ppo_conn0_ep2
```



## Citing

If you use this repository in your research, please cite:
```bibtex
@misc{sumorl,
    author = {Lucas N. Alegre},
    title = {{SUMO-RL}},
    year = {2019},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/LucasAlegre/sumo-rl}},
}
```

List of publications that use SUMO-RL (please open a pull request to add missing entries):
- [Quantifying the impact of non-stationarity in reinforcement learning-based traffic signal control (Alegre et al., 2021)](https://peerj.com/articles/cs-575/)
- [Information-Theoretic State Space Model for Multi-View Reinforcement Learning (Hwang et al., 2023)](https://openreview.net/forum?id=jwy77xkyPt)
- [A citywide TD-learning based intelligent traffic signal control for autonomous vehicles: Performance evaluation using SUMO (Reza et al., 2023)](https://onlinelibrary.wiley.com/doi/full/10.1111/exsy.13301)
- [Handling uncertainty in self-adaptive systems: an ontology-based reinforcement learning model (Ghanadbashi et al., 2023)](https://link.springer.com/article/10.1007/s40860-022-00198-x)
- [Multiagent Reinforcement Learning for Traffic Signal Control: a k-Nearest Neighbors Based Approach (Almeida et al., 2022)](https://ceur-ws.org/Vol-3173/3.pdf)
- [From Local to Global: A Curriculum Learning Approach for Reinforcement Learning-based Traffic Signal Control (Zheng et al., 2022)](https://ieeexplore.ieee.org/abstract/document/9832372)
- [Poster: Reliable On-Ramp Merging via Multimodal Reinforcement Learning (Bagwe et al., 2022)](https://ieeexplore.ieee.org/abstract/document/9996639)
- [Using ontology to guide reinforcement learning agents in unseen situations (Ghanadbashi & Golpayegani, 2022)](https://link.springer.com/article/10.1007/s10489-021-02449-5)
- [Information upwards, recommendation downwards: reinforcement learning with hierarchy for traffic signal control (Antes et al., 2022)](https://www.sciencedirect.com/science/article/pii/S1877050922004185)
- [A Comparative Study of Algorithms for Intelligent Traffic Signal Control (Chaudhuri et al., 2022)](https://link.springer.com/chapter/10.1007/978-981-16-7996-4_19)
- [An Ontology-Based Intelligent Traffic Signal Control Model (Ghanadbashi & Golpayegani, 2021)](https://ieeexplore.ieee.org/abstract/document/9564962)
- [Reinforcement Learning Benchmarks for Traffic Signal Control (Ault & Sharon, 2021)](https://openreview.net/forum?id=LqRSh6V0vR)
- [EcoLight: Reward Shaping in Deep Reinforcement Learning for Ergonomic Traffic Signal Control (Agand et al., 2021)](https://s3.us-east-1.amazonaws.com/climate-change-ai/papers/neurips2021/43/paper.pdf)