

Reinforcement Learning environments for Traffic Signal Control with SUMO. Compatible with Gymnasium, PettingZoo, and popular RL libraries.







SUMO-RL provides a simple interface to instantiate Reinforcement Learning (RL) environments with [SUMO]( for Traffic Signal Control.

Goals of this repository:
- Provide a simple interface to work with Reinforcement Learning for Traffic Signal Control using SUMO
- Support Multiagent RL
- Compatibility with gymnasium.Env and popular RL libraries such as [stable-baselines3]( and [RLlib](
- Easy customisation: state and reward definitions are easily modifiable

The main class is [SumoEnvironment](
If instantiated with the parameter `single_agent=True`, it behaves like a regular [Gymnasium Env](
For multiagent environments, use [env]( or [parallel_env]( to instantiate a [PettingZoo]( environment with AEC or Parallel API, respectively.
[TrafficSignal]( is responsible for retrieving information and actuating on traffic lights using the [TraCI]( API.

For more details, check the [documentation online](

## Install

### Install SUMO latest version:

```bash
sudo add-apt-repository ppa:sumo/stable
sudo apt-get update
sudo apt-get install sumo sumo-tools sumo-doc
```
Don't forget to set the `SUMO_HOME` variable (the default SUMO installation path is `/usr/share/sumo`):
```bash
echo 'export SUMO_HOME="/usr/share/sumo"' >> ~/.bashrc
source ~/.bashrc
```
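Before launching an environment from Python, it can help to confirm that `SUMO_HOME` is visible to the interpreter. The following is a minimal sketch: the helper name `sumo_tools_dir` is hypothetical and not part of SUMO-RL, and the `tools` subdirectory is assumed from a standard SUMO installation.

```python
import os


def sumo_tools_dir(environ=os.environ):
    """Return the path of SUMO's tools directory, or raise if SUMO_HOME is unset.

    Hypothetical helper for illustration; not part of SUMO-RL.
    """
    sumo_home = environ.get("SUMO_HOME")
    if not sumo_home:
        raise EnvironmentError("SUMO_HOME is not set")
    # Assumption: a standard SUMO install keeps its Python tools under <SUMO_HOME>/tools.
    return os.path.join(sumo_home, "tools")


if __name__ == "__main__":
    try:
        print("SUMO tools directory:", sumo_tools_dir())
    except EnvironmentError as err:
        print(err)
```

If the message about `SUMO_HOME` is printed, open a new shell or re-run `source ~/.bashrc` so the exported variable is picked up.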
Important: for a large performance boost (~8x) with Libsumo, you can set the environment variable `LIBSUMO_AS_TRACI=1`.
Note that you will not be able to run with sumo-gui or with multiple simulations in parallel if this is active ([more details](

### Install SUMO-RL

The stable release is available through pip:
```bash
pip install sumo-rl
```

Alternatively, you can install the latest (unreleased) version:
```bash
git clone
cd sumo-rl
pip install -e .
```

## MDP - Observations, Actions and Rewards

### Observation

The default observation for each traffic signal agent is a vector:
```python
obs = [phase_one_hot, min_green, lane_1_density,...,lane_n_density, lane_1_queue,...,lane_n_queue]
```
- ```phase_one_hot``` is a one-hot encoded vector indicating the current active green phase
- ```min_green``` is a binary variable indicating whether min_green seconds have already passed in the current phase
- ```lane_i_density``` is the number of vehicles in incoming lane i divided by the total capacity of the lane
- ```lane_i_queue``` is the number of queued (speed below 0.1 m/s) vehicles in incoming lane i divided by the total capacity of the lane
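To make the layout of this vector concrete, here is a minimal sketch that assembles it from plain Python values. The function `build_observation` and its argument names are hypothetical and only illustrate the ordering described above; the library computes these values from the running simulation.

```python
def build_observation(green_phase, num_green_phases, min_green_passed, densities, queues):
    """Assemble [phase_one_hot, min_green, densities..., queues...] as a flat list.

    Hypothetical helper: it mirrors the documented layout, not the library's code.
    """
    if not 0 <= green_phase < num_green_phases:
        raise ValueError("green_phase must index one of the green phases")
    if len(densities) != len(queues):
        raise ValueError("expected one density and one queue value per incoming lane")
    # One-hot encoding of the currently active green phase.
    phase_one_hot = [1.0 if i == green_phase else 0.0 for i in range(num_green_phases)]
    # Binary flag: 1 once min_green seconds have passed in the current phase.
    min_green = [1.0 if min_green_passed else 0.0]
    return phase_one_hot + min_green + list(densities) + list(queues)


obs = build_observation(
    green_phase=2,
    num_green_phases=4,
    min_green_passed=True,
    densities=[0.5, 0.25],
    queues=[0.25, 0.0],
)
print(obs)  # [0.0, 0.0, 1.0, 0.0, 1.0, 0.5, 0.25, 0.25, 0.0]
```

With 4 green phases and 2 incoming lanes the vector has 4 + 1 + 2 + 2 = 9 entries.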

You can define your own observation by implementing a class that inherits from [ObservationFunction]( and passing it to the environment constructor.
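A custom observation could look roughly like the sketch below. To keep it runnable without SUMO, it defines small stand-ins for the library's `ObservationFunction` base class and for a traffic signal. The stand-in interface (a constructor taking the traffic signal and a `__call__` method returning the observation) and the `lane_queues` attribute are assumptions made for illustration, so check the real base class before adapting it.

```python
from abc import ABC, abstractmethod


class ObservationFunction(ABC):
    """Stand-in for the library's base class (assumed interface, not the real one)."""

    def __init__(self, ts):
        self.ts = ts  # the traffic signal this observation is computed for

    @abstractmethod
    def __call__(self):
        """Return the observation for the current simulation step."""


class FakeTrafficSignal:
    """Hypothetical stub exposing only what this example needs."""

    def __init__(self, lane_queues):
        self.lane_queues = lane_queues


class QueueOnlyObservation(ObservationFunction):
    """Observation made of the per-lane queue values only."""

    def __call__(self):
        return list(self.ts.lane_queues)


obs_fn = QueueOnlyObservation(FakeTrafficSignal([0.2, 0.0, 0.7]))
print(obs_fn())  # [0.2, 0.0, 0.7]
```

The real base class may require additional methods, for example one describing the observation space, so treat this only as an outline of the subclassing pattern.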

### Action

The action space is discrete.
Every 'delta_time' seconds, each traffic signal agent can choose the next green phase configuration.

E.g.: In the [2-way single intersection]( there are |A| = 4 discrete actions, corresponding to the following green phase configurations:

Important: every time a phase change occurs, the next phase is preceded by a yellow phase lasting ```yellow_time``` seconds.
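The timing of one decision interval can be sketched as follows. This is a simplified illustration, not the library's implementation: the function `plan_interval` is hypothetical, it assumes the yellow phase fits inside the `delta_time` window, and it ignores any minimum or maximum green constraints.

```python
def plan_interval(current_phase, chosen_phase, delta_time, yellow_time):
    """Return (label, seconds) segments covering the next delta_time seconds.

    Hypothetical helper; assumes yellow_time < delta_time.
    """
    if yellow_time >= delta_time:
        raise ValueError("this sketch assumes yellow_time < delta_time")
    if chosen_phase == current_phase:
        # No phase change: the current green simply continues.
        return [(f"green_{current_phase}", delta_time)]
    # Phase change: a yellow phase precedes the newly chosen green phase.
    return [("yellow", yellow_time), (f"green_{chosen_phase}", delta_time - yellow_time)]


print(plan_interval(0, 0, delta_time=5, yellow_time=2))  # [('green_0', 5)]
print(plan_interval(0, 2, delta_time=5, yellow_time=2))  # [('yellow', 2), ('green_2', 3)]
```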

### Rewards

The default reward function is the change in cumulative vehicle delay. That is, the reward is how much the total delay (sum of the waiting times of all approaching vehicles) changed in relation to the previous time-step.
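As a rough illustration of this reward, the sketch below keeps the previous total delay and returns the difference at each step. The class `DelayChangeReward` is hypothetical, and the sign convention (previous delay minus current delay, so that a shrinking delay yields a positive reward) is an assumption; the library may also scale the value.

```python
class DelayChangeReward:
    """Reward equal to the change in total delay between consecutive steps.

    Hypothetical sketch; sign convention assumed: positive when delay decreases.
    """

    def __init__(self):
        self.last_delay = 0.0  # total delay measured at the previous step

    def __call__(self, waiting_times):
        # Total delay: sum of the waiting times of all approaching vehicles.
        current_delay = float(sum(waiting_times))
        reward = self.last_delay - current_delay
        self.last_delay = current_delay
        return reward


delay_reward = DelayChangeReward()
print(delay_reward([3.0, 5.0]))  # -8.0 (delay grew from 0 to 8)
print(delay_reward([1.0, 2.0]))  # 5.0 (delay fell from 8 to 3)
```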

You can choose a different reward function (see the ones implemented in [TrafficSignal]( with the parameter `reward_fn` in the [SumoEnvironment]( constructor.

It is also possible to implement your own reward function:

```python
from sumo_rl import SumoEnvironment

def my_reward_fn(traffic_signal):
    return traffic_signal.get_average_speed()

env = SumoEnvironment(..., reward_fn=my_reward_fn)
```

## APIs (Gymnasium and PettingZoo)

### Gymnasium Single-Agent API

If your network only has ONE traffic light, then you can instantiate a standard Gymnasium env (see [Gymnasium API](
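```python
import gymnasium as gym
import sumo_rl

# Pass your environment arguments (e.g. net_file) in place of `...`.
env = gym.make('sumo-rl-v0', ...)
obs, info = env.reset()
done = False
while not done:
    # Sample a random action; replace this with your policy.
    next_obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated
```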

### PettingZoo Multi-Agent API

For multi-agent environments, you can use the PettingZoo API (see [PettingZoo API](

```python
import sumo_rl

env = sumo_rl.parallel_env(
    net_file='nets/RESCO/grid4x4/',  # complete this path and add the remaining arguments
)
observations = env.reset()
while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}  # this is where you would insert your policy
    observations, rewards, terminations, truncations, infos = env.step(actions)
```

### RESCO Benchmarks

In the folder [nets/RESCO]( you can find the network and route files from [RESCO]( (Reinforcement Learning Benchmarks for Traffic Signal Control), which was built on top of SUMO-RL. See their [paper]( for results.

## Experiments

Check [experiments]( for examples of how to instantiate an environment and train your RL agent.

### [Q-learning]( in a one-way single intersection:
```bash
python experiments/
```

### [RLlib PPO]( multiagent in a 4x4 grid:
```bash
python experiments/
```

### [stable-baselines3 DQN]( in a 2-way single intersection:
Note: you need to install stable-baselines3 with ```pip install "stable_baselines3[extra]>=2.0.0a9"``` for [Gymnasium compatibility](
```bash
python experiments/
```

### Plotting results:
```bash
python outputs/ -f outputs/4x4grid/ppo_conn0_ep2
```

## Citing

If you use this repository in your research, please cite:
```bibtex
@misc{sumorl,
    author = {Lucas N. Alegre},
    title = {{SUMO-RL}},
    year = {2019},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{}},
}
```

List of publications that use SUMO-RL (please open a pull request to add missing entries):
- [Quantifying the impact of non-stationarity in reinforcement learning-based traffic signal control (Alegre et al., 2021)](
- [Information-Theoretic State Space Model for Multi-View Reinforcement Learning (Hwang et al., 2023)](
- [A citywide TD-learning based intelligent traffic signal control for autonomous vehicles: Performance evaluation using SUMO (Reza et al., 2023)](
- [Handling uncertainty in self-adaptive systems: an ontology-based reinforcement learning model (Ghanadbashi et al., 2023)](
- [Multiagent Reinforcement Learning for Traffic Signal Control: a k-Nearest Neighbors Based Approach (Almeida et al., 2022)](
- [From Local to Global: A Curriculum Learning Approach for Reinforcement Learning-based Traffic Signal Control (Zheng et al., 2022)](
- [Poster: Reliable On-Ramp Merging via Multimodal Reinforcement Learning (Bagwe et al., 2022)](
- [Using ontology to guide reinforcement learning agents in unseen situations (Ghanadbashi & Golpayegani, 2022)](
- [Information upwards, recommendation downwards: reinforcement learning with hierarchy for traffic signal control (Antes et al., 2022)](
- [A Comparative Study of Algorithms for Intelligent Traffic Signal Control (Chaudhuri et al., 2022)](
- [An Ontology-Based Intelligent Traffic Signal Control Model (Ghanadbashi & Golpayegani, 2021)](
- [Reinforcement Learning Benchmarks for Traffic Signal Control (Ault & Sharon, 2021)](
- [EcoLight: Reward Shaping in Deep Reinforcement Learning for Ergonomic Traffic Signal Control (Agand et al., 2021)](