{"id":50486769,"url":"https://github.com/poudel-bibek/urban-control","last_synced_at":"2026-06-01T23:02:18.498Z","repository":{"id":272303576,"uuid":"915480881","full_name":"poudel-bibek/Urban-Control","owner":"poudel-bibek","description":"Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning","archived":false,"fork":false,"pushed_at":"2025-07-19T15:09:20.000Z","size":234869,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-19T19:11:10.209Z","etag":null,"topics":["optimization","pedestrian","ppo","real-world","reinforcement-learning","sumo","traffic","traffic-control","vehicles"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/poudel-bibek.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-12T00:20:03.000Z","updated_at":"2025-07-19T15:09:23.000Z","dependencies_parsed_at":"2025-01-13T15:45:01.835Z","dependency_job_id":"903ecbed-a49e-47c0-81d7-f67353a66c4c","html_url":"https://github.com/poudel-bibek/Urban-Control","commit_stats":null,"previous_names":["poudel-bibek/urban-control"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/poudel-bibek/Urban-Control","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poudel-bibek%2FUrban-Control","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poudel-bibek%2FUrban-Control/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/poudel-bibek%2FUrban-Control/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poudel-bibek%2FUrban-Control/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/poudel-bibek","download_url":"https://codeload.github.com/poudel-bibek/Urban-Control/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poudel-bibek%2FUrban-Control/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33797128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-01T02:00:06.963Z","response_time":115,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["optimization","pedestrian","ppo","real-world","reinforcement-learning","sumo","traffic","traffic-control","vehicles"],"created_at":"2026-06-01T23:02:16.485Z","updated_at":"2026-06-01T23:02:18.489Z","avatar_url":"https://github.com/poudel-bibek.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning\n\n\u003ca href=\"https://arxiv.org/abs/2504.05018\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv--green\"\u003e\u003c/a\u003e \u003ca href=\"https://youtu.be/Tec3H72cDT4\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/Demo--red\"\u003e\u003c/a\u003e \u003ca href=\"https://www.youtube.com/watch?v=zMbu8zeQEFs\"\u003e\u003cimg src=\"https://img.shields.io/badge/Presentation--red\"\u003e\u003c/a\u003e\n\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://youtu.be/Tec3H72cDT4\"\u003e\u003cimg src=\"https://github.com/poudel-bibek/Urban-Control/blob/main/images/craver_3d.gif\" alt=\"craver\" style=\"width:800px\"/\u003e\u003c/a\u003e\n\n\n### 📌 Overview\n\nThis project uses Proximal Policy Optimization (PPO) to jointly optimize traffic signal control for pedestrians and vehicles along the Craver Road corridor, featuring one intersection (with four signalized crosswalks) and seven midblock crossings. Our approach reduces waiting times by up to 52% for vehicles and 67% for pedestrians compared to traditional fixed-time signal control. \n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/poudel-bibek/Urban-Control/blob/main/images/system_overview.png\" alt=\"System Overview\" style=\"width:800px\"/\u003e\n  \u003cbr\u003e\n  \u003cem\u003e(a) System overview and (b) agent actions in the Intersection and (c) Mid-Block crossings.\u003c/em\u003e\n\u003c/p\u003e\n\n #### 📂 Project Structure\n\n-  `simulation/`: Contains SUMO network, configuration and trip files\n-  `simulation/env.py`: Python-SUMO interface\n-  `ppo/ppo_alg.py`: PPO algorithm implementation (with parallelized sumo environments)\n-  `ppo/models.py`: Policy network architectures\n-  `main.py`: Main script for running experiments\n-  `config.py`: Setup for training, evaluation\n-  `sweep.py`: Hyperparameter tuning using wandb sweep\n\n---\n\n### 📊 Data\n- [Training logs in wandb](https://api.wandb.ai/links/Fluidic-city/kt1tlg8f) \n- [Trained policy](https://github.com/poudel-bibek/Urban-Control/blob/main/saved_models) and [config file](https://github.com/user-attachments/files/19145941/config_Feb24_19-06-53.json)\n- Unscaled trips: 
[pedestrian](https://github.com/poudel-bibek/Urban-Control/blob/main/simulation/original_pedtrips.xml), [vehicle](https://github.com/poudel-bibek/Urban-Control/blob/main/simulation/original_vehtrips.xml)\n- Results reported in the paper (JSON files): [Traffic Signal](https://github.com/user-attachments/files/19145910/eval_tl.json), [Unsignalized](https://github.com/user-attachments/files/19145909/eval_unsignalized.json), [RL](https://github.com/user-attachments/files/19145911/eval_ppo.json).\n- Rollout videos:\n\t| Method | Demand (1x) | Demand (2.5x) |\n\t|--|--|--|\n\t| Unsignalized | [link](https://youtu.be/XWkNqePOXPo) | [link](https://youtu.be/VC9E25Ys5RY) |\n\t| Signalized | [link](https://youtu.be/j9cxdP3pj_c) | [link](https://youtu.be/JaxmSJG-B5E) |\n\t| RL (Ours) | [link](https://youtu.be/80-0g7RuBIg) | [link](https://youtu.be/HHrltmck6l8) |\n\n---\n\n### 🛠️ Setup \u0026 Training:\n\n #### 📋 Requirements:\n- SUMO version: [1.21](https://github.com/eclipse-sumo/sumo/releases/tag/v1_21_0)\n- Python version: 3.12 ([Anaconda 2024.06](https://repo.anaconda.com/archive/) recommended)\n- Install the required packages:\n\n\t```bash\n\tpip install -r requirements.txt\n\t```\n\n ---\n #### 🏋️ Training:\n\n- Step 1: Complete the setup.\n\n- Step 2: Open a terminal. On Linux or WSL, run:\n\n\t```bash\n\tulimit -n 20000\n\t```\n\n This raises the limit on the number of file descriptors that a process can open.\n\n- Step 3: In the [config.py](https://github.com/poudel-bibek/Urban-Control/blob/main/config.py) file, set `sweep`, `evaluate`, and `gui` to `False`.\n\n- Step 4: Run the following command:\n\n\t```bash\n\tpython main.py\n\t```\n\n- Step 5: To view the TensorBoard logs, run the following command:\n\n\t```bash\n\ttensorboard --logdir=./runs\n\t```\n\n#### Some important parameters that you can change in the [config.py](https://github.com/poudel-bibek/Urban-Control/blob/main/config.py) file during training:\n-  `gui: True` to run the simulation with the GUI.\n-  
`gpu: True` to run the simulation on GPU.\n-  `sweep: True` to run hyperparameter tuning.\n-  `evaluate: True` to evaluate a trained policy.\n-  `\"step_length\"`: Real-world time in seconds per simulation timestep (default: 1.0)\n-  `\"action_duration\"`: Number of simulation timesteps for each action (default: 10)\n-  `\"total_timesteps\"`: Total training timesteps (default: 8000000)\n-  `\"max_timesteps\"`: Maximum simulation steps per episode (default: 600)\n-  `\"num_processes\"`: Number of parallel processes (default: 8). Increase or reduce this according to your CPU.\n\n ---\n #### 📈 Evaluation and Benchmarks \n\n- Set `eval_model_path` in the [config.py](https://github.com/poudel-bibek/Urban-Control/blob/main/config.py) file. Modify other evaluation parameters as needed.\n- Set `evaluate: True` in the [config.py](https://github.com/poudel-bibek/Urban-Control/blob/main/config.py) file.\n- Run the following command:\n\n\t```bash\n\tpython main.py\n\t```\n\n- This runs the benchmarks in the order RL, Traffic Signal, and Unsignalized, as defined in the `main.py` file. 
If you want to run a specific benchmark, comment out the other two.\n\n\t```python\n\tppo_results_path = eval(control_args, ppo_args, eval_args, policy_path=config['eval_model_path'], tl= False)\n\ttl_results_path = eval(control_args, ppo_args, eval_args, policy_path=None, tl= True, unsignalized=False) \n\tunsignalized_results_path = eval(control_args, ppo_args, eval_args, policy_path=None, tl= True, unsignalized=True)\n\t```\n\n- Benchmark result JSON files are saved in the `results` folder.\n\n ---\n #### 🔍 Hyperparameter sweep\n- Set `sweep: True` in the [config.py](https://github.com/poudel-bibek/Urban-Control/blob/main/config.py) file.\n- Modify the `create_sweep_config` method in [sweep.py](https://github.com/poudel-bibek/Urban-Control/blob/main/sweep.py) to set the parameters/method to use.\n- Run the following command:\n\n\t```bash\n\tpython main.py\n\t```\n- Create a [wandb account](https://wandb.ai/site) and [log in/authorize](https://wandb.ai/authorize). \n- You will also have to set up a project and set its name in `self.project` in [sweep.py](https://github.com/poudel-bibek/Urban-Control/blob/main/sweep.py).\n\n---\n### 📝 Notes: \n- The initial `100-250` timesteps (randomly chosen) are a warmup period, 
defined in the `reset` method in [env.py](https://github.com/poudel-bibek/Urban-Control/blob/main/simulation/env.py).\n- Even when the episode horizon is the same, rollouts at higher demand take longer because of the higher CPU load.\n- This code was developed and tested on Ubuntu 24.04 and Windows 11 + WSL2.\n- ⚠️ If something fails, check the `sumo_logfile.txt` and `sumo_errorlog.txt` files in the `simulation` folder.\n\n---\n### 📖 Citation\nIf you find this work useful in your own research:\n```\n@inproceedings{poudel2025control,\n  title={Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning},\n  author={Poudel, Bibek and Wang, Xuan and Li, Weizi and Zhu, Lei and Heaslip, Kevin},\n  booktitle={2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},\n  year={2025},\n  url={https://arxiv.org/abs/2504.05018},\n}\n\n```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoudel-bibek%2Furban-control","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpoudel-bibek%2Furban-control","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoudel-bibek%2Furban-control/lists"}