This is the training code for:

**Journal Version:** M. Everett, Y. Chen, and J. P. How, "Collision Avoidance in Pedestrian-Rich Environments with Deep Reinforcement Learning", in review. [Link to Paper](https://arxiv.org/abs/1910.11689)

**Conference Version:** M. Everett, Y. Chen, and J. P. How, "Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
[Link to Paper](https://arxiv.org/abs/1805.01956), [Link to Video](https://www.youtube.com/watch?v=XHoXkWLhwYQ)

The [gym environment code](https://github.com/mit-acl/gym-collision-avoidance) is included as a submodule.

### Install

Grab the code from GitHub, initialize submodules, and install the dependencies and source code:
```bash
# Clone either through SSH or HTTPS (MIT-ACL users should use GitLab origin)
git clone --recursive git@github.com:mit-acl/rl_collision_avoidance.git

cd rl_collision_avoidance
./install.sh
```

There are some moderately large (tens of MB) checkpoint files containing network weights stored in this repo as Git LFS files; they should be downloaded automatically by the install script.

### Train RL (starting with a network initialized through supervised learning on CADRL decisions)

To start a GA3C training run (it should reach a rolling reward of roughly -0.05 to 0.05 at the start):
```bash
./train.sh TrainPhase1
```

<!-- It should produce an output stream somewhat like this:
<img src="docs/_static/terminal_train_phase_1.gif" alt="Example output of terminal">
 -->

To load that checkpoint and continue phase 2 of training, update the `LOAD_FROM_WANDB_RUN_ID` path in `Config.py` and run:
```bash
./train.sh TrainPhase2
```

By default, RL checkpoints are stored in `RL_tmp`, and files may be overwritten if you train multiple runs. Instead, I like using `wandb` as a way of recording experiments and saving network parameters. To enable this, set the `self.USE_WANDB` flag to `True` in `Config.py`; checkpoints will then be stored in `RL/wandb/run-<datetime>-<id>`.

### To run experiments on AWS

Start several (e.g., 5) AWS instances -- I used `c5.2xlarge` because they have 8 vCPUs and 16 GB RAM (similar to my desktop).
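Provisioning those instances can be scripted; a minimal sketch with the AWS CLI (the AMI ID, key pair, and security group below are placeholders, not values from this repo):

```shell
# Launch 5 on-demand c5.2xlarge instances (placeholder AMI / key / security group)
aws ec2 run-instances \
  --image-id ami-xxxxxxxx \
  --instance-type c5.2xlarge \
  --count 5 \
  --key-name my-key \
  --security-group-ids sg-xxxxxxxx

# List their public IPs once running, to paste into ga3c_cadrl_aws.sh
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
            "Name=instance-type,Values=c5.2xlarge" \
  --query "Reservations[].Instances[].PublicIpAddress" \
  --output text
```

This is only a provisioning fragment; it requires configured AWS credentials and real resource IDs to run.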
Note: this is just an example and won't work out of the box for you (it has hard-coded paths).

Add the IP addresses into `ga3c_cadrl_aws.sh`.
```bash
./ga3c_cadrl_aws.sh panes
# C-a :setw synchronize-panes -- lets you enter the same command in each instance
```

Then you can follow the install & train instructions just like normal. When training, it will prompt you for a wandb login (paste in the authorization code from app.wandb.ai/authorize).

### Observed Issues

On macOS, if you see the following when running the `./train.sh` script:
```bash
objc[39391]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[39391]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
```
set this environment variable: `export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES`.

### If you find this code useful, please consider citing:

```
@inproceedings{Everett18_IROS,
  address = {Madrid, Spain},
  author = {Everett, Michael and Chen, Yu Fan and How, Jonathan P.},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  month = sep,
  title = {Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning},
  year = {2018},
  url = {https://arxiv.org/pdf/1805.01956.pdf}
}
```
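As a footnote to the Observed Issues section above: the fork-safety workaround only needs to be exported once per shell session, and you can confirm child processes inherit it before launching training (a minimal sketch; `python3` is used here only to check the variable from a subprocess):

```shell
# macOS fork-safety workaround: export before running ./train.sh
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

# confirm a child process inherits it (prints YES)
python3 -c 'import os; print(os.environ["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"])'
```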