{"id":24396812,"url":"https://github.com/ai4ce/snac","last_synced_at":"2025-08-23T02:30:54.858Z","repository":{"id":59259943,"uuid":"353099049","full_name":"ai4ce/SNAC","owner":"ai4ce","description":"[ICLR2023] Learning Simultaneous Navigation and Construction in Grid Worlds","archived":false,"fork":false,"pushed_at":"2023-09-07T01:01:27.000Z","size":398376,"stargazers_count":20,"open_issues_count":0,"forks_count":4,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-11T16:59:58.586Z","etag":null,"topics":["benchmark","construction","mobile-manipulation","navigation","reinforcement-learning-environments","robotics"],"latest_commit_sha":null,"homepage":"https://ai4ce.github.io/SNAC","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ai4ce.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-03-30T18:15:17.000Z","updated_at":"2024-12-13T06:18:51.000Z","dependencies_parsed_at":"2025-04-11T17:02:53.270Z","dependency_job_id":null,"html_url":"https://github.com/ai4ce/SNAC","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ai4ce/SNAC","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai4ce%2FSNAC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai4ce%2FSNAC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai4ce%2FSNAC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai4ce%2FSNAC/manifests","owner_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/owners/ai4ce","download_url":"https://codeload.github.com/ai4ce/SNAC/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai4ce%2FSNAC/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271732368,"owners_count":24811310,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","construction","mobile-manipulation","navigation","reinforcement-learning-environments","robotics"],"created_at":"2025-01-19T21:58:33.796Z","updated_at":"2025-08-23T02:30:54.810Z","avatar_url":"https://github.com/ai4ce.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Learning Simultaneous Navigation and Construction in Grid Worlds\n\n[**Wenyu Han**](https://www.linkedin.com/in/wenyuhan0616), [**Haoran Wu**](https://www.linkedin.com/in/haoran-lucas-ng-4053471a0/), [**Eisuke Hirota**](https://www.linkedin.com/in/eisukeh/), [**Alexander Gao**](https://www.alexandergao.com/), [**Lerrel Pinto**](https://www.lerrelpinto.com/), [**Ludovic Righetti**](https://wp.nyu.edu/machinesinmotion/89-2/), [**Chen Feng**](https://engineering.nyu.edu/faculty/chen-feng), \n\n## Abstract\nWe propose to study a new learning task, mobile construction, to enable an agent to build 
designed structures in 1/2/3D grid worlds while navigating in the same evolving environments. Unlike existing robot learning tasks such as visual navigation and object manipulation, this task is challenging because of the interdependence between accurate localization and strategic construction planning. In pursuit of generic and adaptive solutions to this partially observable Markov decision process (POMDP) based on deep reinforcement learning (RL), we design a Deep Recurrent Q-Network (DRQN) with explicit recurrent position estimation in this dynamic grid world. Our extensive experiments show that pre-training this position estimation module before Q-learning can significantly improve the construction performance measured by the intersection-over-union score, achieving the best results in our benchmark of various baselines including model-free and model-based RL, a handcrafted SLAM-based policy, and human players.\n\n# Installation\n\nWe recommend creating a virtual environment for running this project. The environment setup steps are as follows:\n\n## Create and activate a new conda environment\n\n```\nconda create -n my-conda-env python=3.7\nconda activate my-conda-env\n```\n\n## Note: PyTorch is required; install the build that matches your own system. Here we use Linux and CUDA version 11.7 as an example.\n\n```pip3 install torch torchvision torchaudio```\n\n## Next, install the other dependencies listed in requirements.txt\n\n```pip install -r requirements.txt```\n\n## How to use\n\nOur environment is built on [OpenAI Gym](https://gym.openai.com/), so you can use it in a similar way. 
Here is an example using the 1D static task environment.\n\n```\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n# The environment module lives in the Env/ folder; you may need to add that folder to your Python path.\nfrom DMP_Env_1D_static import deep_mobile_printing_1d1r\n\nenv = deep_mobile_printing_1d1r(plan_choose=2)  # plan_choose: 0 = sine, 1 = Gaussian, 2 = step curve\nobservation = env.reset()\nfig = plt.figure(figsize=(5, 5))\nax = fig.add_subplot(1, 1, 1)\nax.clear()\nfor _ in range(1000):\n  action = np.random.randint(env.action_dim)  # your agent here (this takes random actions)\n  observation, reward, done = env.step(action)\n  env.render(ax)\n  plt.pause(0.1)\n  if done:\n    break\nplt.show()\n```\n\n# Reproduce experiment results\n\nThe scripts for each method are in the script/ folder, whose subfolders contain the policies for 1D, 2D, and 3D tasks. The hyperparameters used for each case are in the config/ folder, which has the same structure as the script/ folder. The scripts for the simulation environments are in the Env/ folder. To reproduce an experiment, run the algorithm script with its corresponding hyperparameter YML file. For example, to train the DQN policy on the 2D variable dense task:\n\n```\ncd script/DQN/2d/\npython DQN_2d_dynamic.py ../../../config/DQN/2D/dynamic_dense.yml\n```\n\n# Multiprocess\n\nWe also provide a multiprocess script for batch simulation.\n\n```\npython multiprocess.py --env 1DStatic --plan_type 0 --num_envs 5\n```\n\n## [Paper (OpenReview)](https://openreview.net/forum?id=NEtep2C7yD)\nTo cite our paper:\n```\n@inproceedings{\n    anonymous2023learning,\n    title={Learning Simultaneous Navigation and Construction in Grid Worlds},\n    author={Anonymous},\n    booktitle={Submitted to The Eleventh International Conference on Learning Representations },\n    year={2023},\n    url={https://openreview.net/forum?id=NEtep2C7yD},\n    note={under review}\n}\n```\n\n## Acknowledgment\nThis research is supported by the NSF CPS program under CMMI-1932187. 
The authors gratefully thank our human test participants; [**Bolei Zhou**](http://bzhou.ie.cuhk.edu.hk/), [**Zhen Liu**](http://itszhen.com/), and the anonymous reviewers for their helpful comments; and [**Congcong Wen**](https://scholar.google.com/citations?hl=en\u0026user=OTBgvCYAAAAJ) for paper revision.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fai4ce%2Fsnac","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fai4ce%2Fsnac","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fai4ce%2Fsnac/lists"}