{"id":17326063,"url":"https://github.com/lucasalegre/sumo-rl","last_synced_at":"2025-05-14T08:06:26.468Z","repository":{"id":37735210,"uuid":"161216111","full_name":"LucasAlegre/sumo-rl","owner":"LucasAlegre","description":"Reinforcement Learning environments for Traffic Signal Control with SUMO. Compatible with Gymnasium, PettingZoo, and popular RL libraries.","archived":false,"fork":false,"pushed_at":"2025-02-19T19:51:16.000Z","size":43223,"stargazers_count":841,"open_issues_count":18,"forks_count":218,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-05-11T14:44:04.976Z","etag":null,"topics":["deep-reinforcement-learning","gym","gym-env","gymnasium","machine-learning","pettingzoo","python","reinforcement-learning","rl-algorithms","sumo","traffic-signal-control"],"latest_commit_sha":null,"homepage":"https://lucasalegre.github.io/sumo-rl","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LucasAlegre.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.bib","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-10T18:04:33.000Z","updated_at":"2025-05-10T20:15:42.000Z","dependencies_parsed_at":"2022-07-16T10:30:41.791Z","dependency_job_id":"c16502cd-77a1-448b-9199-852886a13fe0","html_url":"https://github.com/LucasAlegre/sumo-rl","commit_stats":{"total_commits":192,"total_committers":7,"mean_commits":"27.428571428571427","dds":0.078125,"last_synced_commit":"a134ebbb9be0518204d21a4ad162b9e828d164e5"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/LucasAlegre%2Fsumo-rl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LucasAlegre%2Fsumo-rl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LucasAlegre%2Fsumo-rl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LucasAlegre%2Fsumo-rl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LucasAlegre","download_url":"https://codeload.github.com/LucasAlegre/sumo-rl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254101616,"owners_count":22014909,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-reinforcement-learning","gym","gym-env","gymnasium","machine-learning","pettingzoo","python","reinforcement-learning","rl-algorithms","sumo","traffic-signal-control"],"created_at":"2024-10-15T14:14:56.018Z","updated_at":"2025-05-14T08:06:21.443Z","avatar_url":"https://github.com/LucasAlegre.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"docs/_static/logo.png\" align=\"right\" width=\"30%\"/\u003e\n\n[![DOI](https://zenodo.org/badge/161216111.svg)](https://zenodo.org/doi/10.5281/zenodo.10869789)\n[![tests](https://github.com/LucasAlegre/sumo-rl/actions/workflows/linux-test.yml/badge.svg)](https://github.com/LucasAlegre/sumo-rl/actions/workflows/linux-test.yml)\n[![PyPI 
version](https://badge.fury.io/py/sumo-rl.svg)](https://badge.fury.io/py/sumo-rl)\n[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit\u0026logoColor=white)](https://pre-commit.com/)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![License](http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat)](https://github.com/LucasAlegre/sumo-rl/blob/main/LICENSE)\n\n# SUMO-RL\n\n\u003c!-- start intro --\u003e\n\nSUMO-RL provides a simple interface to instantiate Reinforcement Learning (RL) environments with [SUMO](https://github.com/eclipse/sumo) for Traffic Signal Control.\n\nGoals of this repository:\n- Provide a simple interface to work with Reinforcement Learning for Traffic Signal Control using SUMO\n- Support Multiagent RL\n- Be compatible with gymnasium.Env and popular RL libraries such as [stable-baselines3](https://github.com/DLR-RM/stable-baselines3) and [RLlib](https://docs.ray.io/en/main/rllib.html)\n- Easy customisation: state and reward definitions are easily modifiable\n\nThe main class is [SumoEnvironment](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py).\nIf instantiated with the parameter `single_agent=True`, it behaves like a regular [Gymnasium Env](https://github.com/Farama-Foundation/Gymnasium).\nFor multiagent environments, use [env](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py) or [parallel_env](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py) to instantiate a [PettingZoo](https://github.com/PettingZoo-Team/PettingZoo) environment with the AEC or Parallel API, respectively.\n[TrafficSignal](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/traffic_signal.py) is responsible for retrieving information and actuating on traffic lights using the [TraCI](https://sumo.dlr.de/wiki/TraCI) API.\n\nFor more details, check the [documentation 
online](https://lucasalegre.github.io/sumo-rl/).\n\n\u003c!-- end intro --\u003e\n\n## Install\n\n\u003c!-- start install --\u003e\n\n### Install SUMO latest version:\n\n```bash\nsudo add-apt-repository ppa:sumo/stable\nsudo apt-get update\nsudo apt-get install sumo sumo-tools sumo-doc\n```\nDon't forget to set the SUMO_HOME variable (the default SUMO installation path is /usr/share/sumo):\n```bash\necho 'export SUMO_HOME=\"/usr/share/sumo\"' \u003e\u003e ~/.bashrc\nsource ~/.bashrc\n```\nImportant: for a huge performance boost (~8x) with Libsumo, you can declare the variable:\n```bash\nexport LIBSUMO_AS_TRACI=1\n```\nNote that you will not be able to run with sumo-gui or with multiple simulations in parallel if this is active ([more details](https://sumo.dlr.de/docs/Libsumo.html)).\n\n### Install SUMO-RL\n\nThe stable release is available through pip:\n```bash\npip install sumo-rl\n```\n\nAlternatively, you can install the latest (unreleased) version from source:\n```bash\ngit clone https://github.com/LucasAlegre/sumo-rl\ncd sumo-rl\npip install -e .\n```\n\n\u003c!-- end install --\u003e\n\n## MDP - Observations, Actions and Rewards\n\n### Observation\n\n\u003c!-- start observation --\u003e\n\nThe default observation for each traffic signal agent is a vector:\n```python\n    obs = [phase_one_hot, min_green, lane_1_density,...,lane_n_density, lane_1_queue,...,lane_n_queue]\n```\n- ```phase_one_hot``` is a one-hot encoded vector indicating the current active green phase\n- ```min_green``` is a binary variable indicating whether min_green seconds have already passed in the current phase\n- ```lane_i_density``` is the number of vehicles in incoming lane i divided by the total capacity of the lane\n- ```lane_i_queue``` is the number of queued (speed below 0.1 m/s) vehicles in incoming lane i divided by the total capacity of the lane\n\nYou can define your own observation by implementing a class that inherits from 
[ObservationFunction](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/observations.py) and passing it to the environment constructor.\n\n\u003c!-- end observation --\u003e\n\n### Action\n\n\u003c!-- start action --\u003e\n\nThe action space is discrete.\nEvery 'delta_time' seconds, each traffic signal agent can choose the next green phase configuration.\n\nFor example, in the [2-way single intersection](https://github.com/LucasAlegre/sumo-rl/blob/main/experiments/dqn_2way-single-intersection.py) there are |A| = 4 discrete actions, corresponding to the following green phase configurations:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/_static/actions.png\" align=\"center\" width=\"75%\"/\u003e\n\u003c/p\u003e\n\nImportant: every time a phase change occurs, the next phase is preceded by a yellow phase lasting ```yellow_time``` seconds.\n\n\u003c!-- end action --\u003e\n\n### Rewards\n\n\u003c!-- start reward --\u003e\n\nThe default reward function is the change in cumulative vehicle delay:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/_static/reward.png\" align=\"center\" width=\"25%\"/\u003e\n\u003c/p\u003e\n\nThat is, the reward is how much the total delay (sum of the waiting times of all approaching vehicles) changed in relation to the previous time-step.\n\nYou can choose a different reward function (see the ones implemented in [TrafficSignal](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/traffic_signal.py)) with the parameter `reward_fn` in the [SumoEnvironment](https://github.com/LucasAlegre/sumo-rl/blob/main/sumo_rl/environment/env.py) constructor.\n\nIt is also possible to implement your own reward function:\n\n```python\ndef my_reward_fn(traffic_signal):\n    return traffic_signal.get_average_speed()\n\nenv = SumoEnvironment(..., reward_fn=my_reward_fn)\n```\n\n\u003c!-- end reward --\u003e\n\n## API's (Gymnasium and PettingZoo)\n\n### Gymnasium Single-Agent API\n\n\u003c!-- start gymnasium 
--\u003e\n\nIf your network only has ONE traffic light, then you can instantiate a standard Gymnasium env (see [Gymnasium API](https://gymnasium.farama.org/api/env/)):\n```python\nimport gymnasium as gym\nimport sumo_rl\nenv = gym.make('sumo-rl-v0',\n                net_file='path_to_your_network.net.xml',\n                route_file='path_to_your_routefile.rou.xml',\n                out_csv_name='path_to_output.csv',\n                use_gui=True,\n                num_seconds=100000)\nobs, info = env.reset()\ndone = False\nwhile not done:\n    next_obs, reward, terminated, truncated, info = env.step(env.action_space.sample())\n    done = terminated or truncated\n```\n\n\u003c!-- end gymnasium --\u003e\n\n### PettingZoo Multi-Agent API\n\n\u003c!-- start pettingzoo --\u003e\n\nFor multi-agent environments, you can use the PettingZoo API (see [Petting Zoo API](https://pettingzoo.farama.org/api/parallel/)):\n\n```python\nimport sumo_rl\nenv = sumo_rl.parallel_env(net_file='nets/RESCO/grid4x4/grid4x4.net.xml',\n                  route_file='nets/RESCO/grid4x4/grid4x4_1.rou.xml',\n                  use_gui=True,\n                  num_seconds=3600)\nobservations = env.reset()\nwhile env.agents:\n    actions = {agent: env.action_space(agent).sample() for agent in env.agents}  # this is where you would insert your policy\n    observations, rewards, terminations, truncations, infos = env.step(actions)\n```\n\n\u003c!-- end pettingzoo --\u003e\n\n### RESCO Benchmarks\n\nIn the folder [nets/RESCO](https://github.com/LucasAlegre/sumo-rl/tree/main/sumo_rl/nets/RESCO) you can find the network and route files from [RESCO](https://github.com/jault/RESCO) (Reinforcement Learning Benchmarks for Traffic Signal Control), which was built on top of SUMO-RL. 
See their [paper](https://people.engr.tamu.edu/guni/Papers/NeurIPS-signals.pdf) for results.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"sumo_rl/nets/RESCO/maps.png\" align=\"center\" width=\"60%\"/\u003e\n\u003c/p\u003e\n\n### Experiments\n\nCheck [experiments](https://github.com/LucasAlegre/sumo-rl/tree/main/experiments) for examples on how to instantiate an environment and train your RL agent.\n\n### [Q-learning](https://github.com/LucasAlegre/sumo-rl/blob/main/agents/ql_agent.py) in a one-way single intersection:\n```bash\npython experiments/ql_single-intersection.py\n```\n\n### [RLlib PPO](https://docs.ray.io/en/latest/_modules/ray/rllib/algorithms/ppo/ppo.html) multiagent in a 4x4 grid:\n```bash\npython experiments/ppo_4x4grid.py\n```\n\n### [stable-baselines3 DQN](https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/dqn/dqn.py) in a 2-way single intersection:\nObs: you need to install stable-baselines3 with ```pip install \"stable_baselines3[extra]\u003e=2.0.0a9\"``` for [Gymnasium compatibility](https://stable-baselines3.readthedocs.io/en/master/guide/install.html).\n```bash\npython experiments/dqn_2way-single-intersection.py\n```\n\n### Plotting results:\n```bash\npython outputs/plot.py -f outputs/4x4grid/ppo_conn0_ep2\n```\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"outputs/result.png\" align=\"center\" width=\"50%\"/\u003e\n\u003c/p\u003e\n\n## Citing\n\n\u003c!-- start citation --\u003e\n\nIf you use this repository in your research, please cite:\n```bibtex\n@misc{sumorl,\n    author = {Lucas N. 
Alegre},\n    title = {{SUMO-RL}},\n    year = {2019},\n    publisher = {GitHub},\n    journal = {GitHub repository},\n    howpublished = {\\url{https://github.com/LucasAlegre/sumo-rl}},\n}\n```\n\n\u003c!-- end citation --\u003e\n\n\u003c!-- start list of publications --\u003e\n\nList of publications that use SUMO-RL (please open a pull request to add missing entries):\n- [Quantifying the impact of non-stationarity in reinforcement learning-based traffic signal control (Alegre et al., 2021)](https://peerj.com/articles/cs-575/)\n- [Information-Theoretic State Space Model for Multi-View Reinforcement Learning (Hwang et al., 2023)](https://openreview.net/forum?id=jwy77xkyPt)\n- [A citywide TD-learning based intelligent traffic signal control for autonomous vehicles: Performance evaluation using SUMO (Reza et al., 2023)](https://onlinelibrary.wiley.com/doi/full/10.1111/exsy.13301)\n- [Handling uncertainty in self-adaptive systems: an ontology-based reinforcement learning model (Ghanadbashi et al., 2023)](https://link.springer.com/article/10.1007/s40860-022-00198-x)\n- [Multiagent Reinforcement Learning for Traffic Signal Control: a k-Nearest Neighbors Based Approach (Almeida et al., 2022)](https://ceur-ws.org/Vol-3173/3.pdf)\n- [From Local to Global: A Curriculum Learning Approach for Reinforcement Learning-based Traffic Signal Control (Zheng et al., 2022)](https://ieeexplore.ieee.org/abstract/document/9832372)\n- [Poster: Reliable On-Ramp Merging via Multimodal Reinforcement Learning (Bagwe et al., 2022)](https://ieeexplore.ieee.org/abstract/document/9996639)\n- [Using ontology to guide reinforcement learning agents in unseen situations (Ghanadbashi \u0026 Golpayegani, 2022)](https://link.springer.com/article/10.1007/s10489-021-02449-5)\n- [Information upwards, recommendation downwards: reinforcement learning with hierarchy for traffic signal control (Antes et al., 2022)](https://www.sciencedirect.com/science/article/pii/S1877050922004185)\n- [A Comparative Study of 
Algorithms for Intelligent Traffic Signal Control (Chaudhuri et al., 2022)](https://link.springer.com/chapter/10.1007/978-981-16-7996-4_19)\n- [An Ontology-Based Intelligent Traffic Signal Control Model (Ghanadbashi \u0026 Golpayegani, 2021)](https://ieeexplore.ieee.org/abstract/document/9564962)\n- [Reinforcement Learning Benchmarks for Traffic Signal Control (Ault \u0026 Sharon, 2021)](https://openreview.net/forum?id=LqRSh6V0vR)\n- [EcoLight: Reward Shaping in Deep Reinforcement Learning for Ergonomic Traffic Signal Control (Agand et al., 2021)](https://s3.us-east-1.amazonaws.com/climate-change-ai/papers/neurips2021/43/paper.pdf)\n\n\u003c!-- end list of publications --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucasalegre%2Fsumo-rl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucasalegre%2Fsumo-rl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucasalegre%2Fsumo-rl/lists"}