{"id":28061260,"url":"https://github.com/stanfordvl/gibsonenv","last_synced_at":"2025-05-12T09:55:14.596Z","repository":{"id":49402362,"uuid":"114683429","full_name":"StanfordVL/GibsonEnv","owner":"StanfordVL","description":"Gibson Environments: Real-World Perception for Embodied Agents","archived":false,"fork":false,"pushed_at":"2024-04-15T12:20:11.000Z","size":81903,"stargazers_count":896,"open_issues_count":49,"forks_count":146,"subscribers_count":29,"default_branch":"master","last_synced_at":"2025-04-01T14:39:17.716Z","etag":null,"topics":["computer-vision","cvpr2018","deep-learning","deep-reinforcement-learning","reinforcement-learning","research","robotics","ros","sim2real","simulator"],"latest_commit_sha":null,"homepage":"http://gibsonenv.stanford.edu/","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/StanfordVL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-18T20:12:35.000Z","updated_at":"2025-03-21T05:53:13.000Z","dependencies_parsed_at":"2024-06-20T13:09:20.899Z","dependency_job_id":null,"html_url":"https://github.com/StanfordVL/GibsonEnv","commit_stats":{"total_commits":907,"total_committers":10,"mean_commits":90.7,"dds":0.5292171995589856,"last_synced_commit":"f474d9efd5b5ef703e3bf630a6f7448b54875d0c"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StanfordVL%2FGibsonEnv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StanfordVL%2FGibsonEnv/tags","releases_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StanfordVL%2FGibsonEnv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StanfordVL%2FGibsonEnv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/StanfordVL","download_url":"https://codeload.github.com/StanfordVL/GibsonEnv/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253712002,"owners_count":21951683,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","cvpr2018","deep-learning","deep-reinforcement-learning","reinforcement-learning","research","robotics","ros","sim2real","simulator"],"created_at":"2025-05-12T09:55:13.936Z","updated_at":"2025-05-12T09:55:14.571Z","avatar_url":"https://github.com/StanfordVL.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GIBSON ENVIRONMENT for Embodied Active Agents with Real-World Perception \n\nYou shouldn't play video games all day, so shouldn't your AI! We built a virtual environment simulator, Gibson, that offers real-world experience for learning perception.  \n\n\u003cimg src=misc/ui.gif width=\"600\"\u003e\n \n**Summary**: Perception and being active (i.e. having a certain level of motion freedom) are closely tied. Learning active perception and sensorimotor control in the physical world is cumbersome as existing algorithms are too slow to efficiently learn in real-time and robots are fragile and costly. 
This has given a fruitful rise to learning in the simulation which consequently casts a question on transferring to real-world. We developed Gibson environment with the following primary characteristics:  \n\n**I.** being from the real-world and reflecting its semantic complexity through virtualizing real spaces,  \n**II.** having a baked-in mechanism for transferring to real-world (Goggles function), and  \n**III.** embodiment of the agent and making it subject to constraints of space and physics via integrating a physics engine ([Bulletphysics](http://bulletphysics.org/wordpress/)).  \n\n**Naming**: Gibson environment is named after *James J. Gibson*, the author of \"Ecological Approach to Visual Perception\", 1979. “We must perceive in order to move, but we must also move in order to perceive” – JJ Gibson\n\nPlease see the [website](http://gibson.vision/) (http://gibsonenv.stanford.edu/) for more technical details. This repository is intended for distribution of the environment and installation/running instructions.\n\n#### Paper\n**[\"Gibson Env: Real-World Perception for Embodied Agents\"](http://gibson.vision/)**, in **CVPR 2018 [Spotlight Oral]**.\n\n\n[![Gibson summary video](misc/vid_thumbnail_600.png)](https://youtu.be/KdxuZjemyjc \"Click to watch the video summarizing Gibson environment!\")\n\n\n\nRelease\n=================\n**This is the 0.3.1 release. Bug reports, suggestions for improvement, as well as community developments are encouraged and appreciated.** [change log file](misc/CHANGELOG.md).  \n\n\nDatabase\n=================\nThe full database includes 572 spaces and 1440 floors and can be downloaded [here](gibson/data/README.md). A diverse set of visualizations of all spaces in Gibson can be seen [here](http://gibsonenv.stanford.edu/database/). To make the core assets download package lighter for the users, we  include a small subset (39) of the spaces. Users can download the rest of the spaces and add them to the assets folder. 
We also integrated [Stanford 2D3DS](http://3dsemantics.stanford.edu/) and [Matterport 3D](https://niessner.github.io/Matterport/) as separate datasets if one wishes to use Gibson's simulator with those datasets (access [here](gibson/data/README.md)).\n\nTable of contents\n=================\n\n   * [Installation](#installation)\n        * [Quick Installation (docker)](#a-quick-installation-docker)\n        * [Building from source](#b-building-from-source)\n        * [Uninstalling](#uninstalling)\n   * [Quick Start](#quick-start)\n        * [Gibson FPS](#gibson-framerate)\n        * [Web User Interface](#web-user-interface)\n        * [Rendering Semantics](#rendering-semantics)\n        * [Robotic Agents](#robotic-agents)\n        * [ROS Configuration](#ros-configuration)\n   * [Coding your RL agent](#coding-your-rl-agent)\n   * [Environment Configuration](#environment-configuration)\n   * [Goggles: transferring the agent to real-world](#goggles-transferring-the-agent-to-real-world)\n   * [Citation](#citation)\n\n\n\nInstallation\n=================\n\n#### Installation Method\n\nThere are two ways to install gibson, A. using our docker image (recommended) and B. building from source. \n\n#### System requirements\n\nThe minimum system requirements are the following:\n\nFor docker installation (A): \n- Ubuntu 16.04\n- Nvidia GPU with VRAM \u003e 6.0GB\n- Nvidia driver \u003e= 384\n- CUDA \u003e= 9.0, CuDNN \u003e= v7\n\nFor building from the source(B):\n- Ubuntu \u003e= 14.04\n- Nvidia GPU with VRAM \u003e 6.0GB\n- Nvidia driver \u003e= 375\n- CUDA \u003e= 8.0, CuDNN \u003e= v5\n\n#### Download data\n\nFirst, our environment core assets data are available [here](https://storage.googleapis.com/gibson_scenes/assets_core_v2.tar.gz). You can follow the installation guide below to download and set up them properly. `gibson/assets` folder stores necessary data (agent models, environments, etc) to run gibson environment. 
Users can add more environments files into `gibson/assets/dataset` to run gibson on more environments. Visit the [database readme](gibson/data/README.md) for downloading more spaces. Please sign the [license agreement](gibson/data/README.md#download) before using Gibson's database.\n\n\nA. Quick installation (docker)\n-----\n\nWe use docker to distribute our software, you need to install [docker](https://docs.docker.com/engine/installation/) and [nvidia-docker2.0](https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)) first. \n\nRun `docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi` to verify your installation. \n\nYou can either 1. pull from our docker image (recommended) or 2. build your own docker image.\n\n\n1. Pull from our docker image (recommended)\n\n```bash\n# download the dataset from https://storage.googleapis.com/gibson_scenes/dataset.tar.gz\ndocker pull xf1280/gibson:0.3.1\nxhost +local:root\ndocker run --runtime=nvidia -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v \u003chost path to dataset folder\u003e:/root/mount/gibson/gibson/assets/dataset xf1280/gibson:0.3.1\n```\n\n2. Build your own docker image \n```bash\ngit clone https://github.com/StanfordVL/GibsonEnv.git\ncd GibsonEnv\n./download.sh # this script downloads assets data file and decompress it into gibson/assets folder\ndocker build . -t gibson ### finish building inside docker, note by default, dataset will not be included in the docker images\nxhost +local:root ## enable display from docker\n```\nIf the installation is successful, you should be able to run `docker run --runtime=nvidia -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v \u003chost path to dataset folder\u003e:/root/mount/gibson/gibson/assets/dataset gibson` to create a container. Note that we don't include\ndataset files in docker image to keep our image slim, so you will need to mount it to the container when you start a container. 
\n\n#### Notes on deployment on a headless server\n\nGibson Env supports deployment on a headless server and remote access with `x11vnc`. \nYou can build your own docker image with the docker file `Dockerfile` as above.\nInstructions to run gibson on a headless server (requires X server running):\n\n1. Install nvidia-docker2 dependencies following the starter guide. Install `x11vnc` with `sudo apt-get install x11vnc`.\n2. Have xserver running on your host machine, and run `x11vnc` on DISPLAY :0.\n3. `docker run --runtime=nvidia -ti --rm -e DISPLAY -v /tmp/.X11-unix/X0:/tmp/.X11-unix/X0 -v \u003chost path to dataset folder\u003e:/root/mount/gibson/gibson/assets/dataset \u003cgibson image name\u003e`\n4. Run gibson with `python \u003cgibson example or training\u003e` inside docker.\n5. Visit your `host:5900` and you should be able to see the GUI.\n\nIf you don't have X server running, you can still run gibson, see [this guide](https://github.com/StanfordVL/GibsonEnv/wiki/Running-GibsonEnv-on-headless-server) for more details.\n\nB. Building from source\n-----\nIf you don't want to use our docker image, you can also install gibson locally. This will require some dependencies to be installed. \n\nFirst, make sure you have Nvidia driver and CUDA installed. If you install from source, CUDA 9 is not necessary, as that is for nvidia-docker 2.0. Then, let's install some dependencies:\n\n```bash\napt-get update \napt-get install libglew-dev libglm-dev libassimp-dev xorg-dev libglu1-mesa-dev libboost-dev \\\n\t\tmesa-common-dev freeglut3-dev libopenmpi-dev cmake golang libjpeg-turbo8-dev wmctrl \\\n\t\txdotool libzmq3-dev zlib1g-dev\n```\t\n\nInstall required deep learning libraries: Using python3.5 is recommended. You can create a python3.5 environment first. 
\n\n```bash\nconda create -n py35 python=3.5 anaconda \nsource activate py35 # the rest of the steps needs to be performed in the conda environment\nconda install -c conda-forge opencv\npip install http://download.pytorch.org/whl/cu90/torch-0.3.1-cp35-cp35m-linux_x86_64.whl \npip install torchvision==0.2.0\npip install tensorflow==1.3\n```\nClone the repository, download data and build\n```bash\ngit clone https://github.com/StanfordVL/GibsonEnv.git\ncd GibsonEnv\n./download.sh # this script downloads assets data file and decompress it into gibson/assets folder\n./build.sh build_local ### build C++ and CUDA files\npip install -e . ### Install python libraries\n```\n\nInstall OpenAI baselines if you need to run the training demo.\n\n```bash\ngit clone https://github.com/fxia22/baselines.git\npip install -e baselines\n```\n\nUninstalling\n----\n\nUninstall gibson is easy. If you installed with docker, just run `docker images -a | grep \"gibson\" | awk '{print $3}' | xargs docker rmi` to clean up the image. If you installed from source, uninstall with `pip uninstall gibson`\n\n\nQuick Start\n=================\n\nFirst run `xhost +local:root` on your host machine to enable display. You may need to run `export DISPLAY=:0` first. After getting into the docker container with `docker run --runtime=nvidia -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v \u003chost path to dataset folder\u003e:/root/mount/gibson/gibson/assets/dataset gibson`, you will get an interactive shell. Now you can run a few demos. \n\nIf you installed from source, you can run those directly using the following commands without using docker. \n\n\n```bash\npython examples/demo/play_husky_nonviz.py ### Use ASWD keys on your keyboard to control a car to navigate around Gates building\n```\n\n\u003cimg src=misc/husky_nonviz.png width=\"600\"\u003e\n\nYou will be able to use ASWD keys on your keyboard to control a car to navigate around Gates building. 
No camera output is shown in this particular demo. \n\n```bash\npython examples/demo/play_husky_camera.py ### Use ASWD keys on your keyboard to control a car to navigate around Gates building, while RGB and depth camera outputs are also shown.\n```\n\u003cimg src=misc/husky_camera.png width=\"600\"\u003e\n\nYou will be able to use ASWD keys on your keyboard to control a car to navigate around Gates building. You will also be able to see the RGB and depth camera outputs. \n\n```bash\npython examples/train/train_husky_navigate_ppo2.py ### Use PPO2 to train a car to navigate down the hallway in Gates building, using visual input from the camera.\n```\n\n\u003cimg src=misc/husky_train.png width=\"800\"\u003e\nBy running this command you will start training a husky robot to navigate in Gates building and go down the corridor with RGBD input. You will see some RL-related statistics in the terminal after each episode.\n\n\n```bash\npython examples/train/train_ant_navigate_ppo1.py ### Use PPO1 to train an ant to navigate down the hallway in Gates building, using visual input from the camera.\n```\n\n\u003cimg src=misc/ant_train.png width=\"800\"\u003e\nBy running this command you will start training an ant to navigate in Gates building and go down the corridor with RGBD input. You will see some RL-related statistics in the terminal after each episode.\n\n\n\nGibson Framerate\n----\nBelow is Gibson Environment's framerate benchmarked on different platforms. 
Please refer to [fps branch](https://github.com/StanfordVL/GibsonEnv/tree/fps) for the code to reproduce the results.\n\u003ctable class=\"table\"\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003ePlatform\u003c/th\u003e\n    \u003ctd colspan=\"3\"\u003eTested on Intel E5-2697 v4 + NVIDIA Tesla V100\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"col\"\u003eResolution [nxn]\u003c/th\u003e\n    \u003cth scope=\"col\"\u003e128\u003c/th\u003e\n    \u003cth scope=\"col\"\u003e256\u003c/th\u003e\n    \u003cth scope=\"col\"\u003e512\u003c/th\u003e\n \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eRGBD, pre network\u003ccode\u003ef\u003c/code\u003e\u003c/th\u003e\n    \u003ctd\u003e109.1\u003c/td\u003e\n    \u003ctd\u003e58.5\u003c/td\u003e\n    \u003ctd\u003e26.5\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eRGBD, post network\u003ccode\u003ef\u003c/code\u003e\u003c/th\u003e\n    \u003ctd\u003e77.7\u003c/td\u003e\n    \u003ctd\u003e30.6\u003c/td\u003e\n    \u003ctd\u003e14.5\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eRGBD, post small network\u003ccode\u003ef\u003c/code\u003e\u003c/th\u003e\n    \u003ctd\u003e87.4\u003c/td\u003e\n    \u003ctd\u003e40.5\u003c/td\u003e\n    \u003ctd\u003e21.2\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eDepth only\u003c/th\u003e\n    \u003ctd\u003e253.0\u003c/td\u003e\n    \u003ctd\u003e197.9\u003c/td\u003e\n    \u003ctd\u003e124.7\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eSurface Normal only\u003c/th\u003e\n    \u003ctd\u003e207.7\u003c/td\u003e\n    \u003ctd\u003e129.7\u003c/td\u003e\n    \u003ctd\u003e57.2\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eSemantic only\u003c/th\u003e\n    \u003ctd\u003e190.0\u003c/td\u003e\n    \u003ctd\u003e144.2\u003c/td\u003e\n    
\u003ctd\u003e55.6\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth scope=\"row\"\u003eNon-Visual Sensory\u003c/th\u003e\n    \u003ctd\u003e396.1\u003c/td\u003e\n    \u003ctd\u003e396.1\u003c/td\u003e\n    \u003ctd\u003e396.1\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nWe also tested on \u003ccode\u003eIntel I7 7700 + NVIDIA GeForce GTX 1070Ti\u003c/code\u003e and \u003ccode\u003eTested on Intel I7 6580k + NVIDIA GTX 1080Ti\u003c/code\u003e platforms. The FPS difference is within 10% on each task.\n\n\u003ctable class=\"table\"\u003e\n    \u003ctr\u003e\n        \u003cth scope=\"row\"\u003ePlatform\u003c/th\u003e\n        \u003ctd colspan=\"6\"\u003eMulti-process FPS tested on Intel E5-2697 v4 + NVIDIA Tesla V100\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth scope=\"col\"\u003eConfiguration\u003c/th\u003e\n      \u003cth scope=\"col\"\u003e512x512 episode sync\u003c/th\u003e\n      \u003cth scope=\"col\"\u003e512x512 frame sync\u003c/th\u003e\n      \u003cth scope=\"col\"\u003e256x256 episode sync\u003c/th\u003e\n      \u003cth scope=\"col\"\u003e256x256 frame sync\u003c/th\u003e\n      \u003cth scope=\"col\"\u003e128x128 episode sync\u003c/th\u003e\n      \u003cth scope=\"col\"\u003e128x128 frame sync\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth scope=\"row\"\u003e1 process\u003c/th\u003e\n      \u003ctd\u003e12.8\u003c/td\u003e\n      \u003ctd\u003e12.02\u003c/td\u003e\n      \u003ctd\u003e32.98\u003c/td\u003e\n      \u003ctd\u003e32.98\u003c/td\u003e\n      \u003ctd\u003e52\u003c/td\u003e\n      \u003ctd\u003e52\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth scope=\"row\"\u003e2 processes\u003c/th\u003e\n      \u003ctd\u003e23.4\u003c/td\u003e\n      \u003ctd\u003e20.9\u003c/td\u003e\n      \u003ctd\u003e60.89\u003c/td\u003e\n      \u003ctd\u003e53.63\u003c/td\u003e\n      \u003ctd\u003e86.1\u003c/td\u003e\n      \u003ctd\u003e101.8\u003c/td\u003e\n    
\u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth scope=\"row\"\u003e4 processes\u003c/th\u003e\n      \u003ctd\u003e42.4\u003c/td\u003e\n      \u003ctd\u003e31.97\u003c/td\u003e\n      \u003ctd\u003e105.26\u003c/td\u003e\n      \u003ctd\u003e76.23\u003c/td\u003e\n      \u003ctd\u003e97.6\u003c/td\u003e\n      \u003ctd\u003e145.9\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth scope=\"row\"\u003e8 processes\u003c/th\u003e\n      \u003ctd\u003e72.5\u003c/td\u003e\n      \u003ctd\u003e48.1\u003c/td\u003e\n      \u003ctd\u003e138.5\u003c/td\u003e\n      \u003ctd\u003e97.72\u003c/td\u003e\n      \u003ctd\u003e113\u003c/td\u003e\n      \u003ctd\u003e151\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003cimg src=misc/mpi_fps.png width=\"600\"\u003e\n\nWeb User Interface\n----\nWhen running Gibson, you can start a web user interface with `python gibson/utils/web_ui.py 5552`. This is helpful when you cannot physically access the machine running gibson or when you are running in a headless cloud environment. You need to change `mode` in the configuration file to `web_ui` to use the web user interface.\n\n\u003cimg src=misc/web_ui.png width=\"600\"\u003e\n\nRendering Semantics\n----\n\u003cimg src=misc/instance_colorcoding_semantics.png width=\"600\"\u003e\n\nGibson can provide pixel-wise frame-by-frame semantic masks when the model is semantically annotated. As of now we have incorporated models from [Stanford 2D-3D-Semantics Dataset](http://buildingparser.stanford.edu/) and [Matterport 3D](https://niessner.github.io/Matterport/) for this purpose. You can access them within Gibson [here](https://github.com/StanfordVL/GibsonEnv/blob/master/gibson/data/README.md#download-gibson-database-of-spaces). We refer you to the original datasets' references for the list of their semantic classes and annotations. 
\n\nFor detailed instructions of rendering semantics in Gibson, see [semantic instructions](gibson/utils/semantics.md). As one example in the starter dataset that comes with installation, `space7` includes Stanford 2D-3D-Semantics style annotation. \n\n\u003c!---\n**Agreement**: If you choose to use the models from [Stanford 2D3DS](http://3dsemantics.stanford.edu/) or [Matterport 3D](https://niessner.github.io/Matterport/) for rendering semantics, please sign their respective license agreements. Stanford 2D3DS's agreement is inclued in Gibson Database's agreement and does not need to be signed again. For Matterport3D, please see [here](https://niessner.github.io/Matterport/).\n---\u003e\n\nRobotic Agents\n----\n\nGibson provides a base set of agents. See videos of these agents and their corresponding perceptual observation [here](http://gibsonenv.stanford.edu/agents/). \n\u003cimg src=misc/agents.gif\u003e\n\nTo enable (optionally) abstracting away low-level control and robot dynamics for high-level tasks, we also provide a set of practical and ideal controllers for each agent.\n\n| Agent Name     | DOF | Information      | Controller |\n|:-------------: | :-------------: |:-------------: |:-------------| \n| Mujoco Ant      | 8   | [OpenAI Link](https://blog.openai.com/roboschool/) | Torque |\n| Mujoco Humanoid | 17  | [OpenAI Link](https://blog.openai.com/roboschool/) | Torque |\n| Husky Robot     | 4   | [ROS](http://wiki.ros.org/Robots/Husky), [Manufacturer](https://www.clearpathrobotics.com/) | Torque, Velocity, Position |\n| Minitaur Robot  | 8   | [Robot Page](https://www.ghostrobotics.io/copy-of-robots), [Manufacturer](https://www.ghostrobotics.io/) | Sine Controller |\n| JackRabbot      | 2   | [Stanford Project Link](http://cvgl.stanford.edu/projects/jackrabbot/) | Torque, Velocity, Position |\n| TurtleBot       | 2   | [ROS](http://wiki.ros.org/Robots/TurtleBot), [Manufacturer](https://www.turtlebot.com/) | Torque, Velocity, Position |\n| Quadrotor       
  | 6   | [Paper](https://repository.upenn.edu/cgi/viewcontent.cgi?referer=https://www.google.com/\u0026httpsredir=1\u0026article=1705\u0026context=edissertations) | Position |\n\n\n### Starter Code \n\nMore demonstration examples can be found in the `examples/demo` folder.\n\n| Example        | Explanation          |\n|:-------------: |:-------------| \n|`play_ant_camera.py`|Use 1234567890qwerty keys on your keyboard to control an ant to navigate around Gates building, while RGB and depth camera outputs are also shown. |\n|`play_ant_nonviz.py`| Use 1234567890qwerty keys on your keyboard to control an ant to navigate around Gates building.|\n|`play_drone_camera.py`| Use ASWDZX keys on your keyboard to control a drone to navigate around Gates building, while RGB and depth camera outputs are also shown.|\n|`play_drone_nonviz.py`| Use ASWDZX keys on your keyboard to control a drone to navigate around Gates building.|\n|`play_humanoid_camera.py`| Use 1234567890qwertyui keys on your keyboard to control a humanoid to navigate around Gates building. Just kidding: controlling a humanoid with a keyboard is too difficult, so you can only watch it fall. Press R to reset. RGB and depth camera outputs are also shown. |\n|`play_humanoid_nonviz.py`| Watch a humanoid fall. 
Press R to reset.|\n|`play_husky_camera.py`| Use ASWD keys on your keyboard to control a car to navigate around Gates building, while RGB and depth camera outputs are also shown.|\n|`play_husky_nonviz.py`| Use ASWD keys on your keyboard to control a car to navigate around Gates building|\n\nMore training code can be found in `examples/train` folder.\n\n| Example        | Explanation          |\n|:-------------: |:-------------| \n|`train_husky_navigate_ppo2.py`|   Use PPO2 to train a car to navigate down the hallway in Gates building, using RGBD input from the camera.|\n|`train_husky_navigate_ppo1.py`|   Use PPO1 to train a car to navigate down the hallway in Gates building, using RGBD input from the camera.|\n|`train_ant_navigate_ppo1.py`| Use PPO1 to train an ant to navigate down the hallway in Gates building, using visual input from the camera. |\n|`train_ant_climb_ppo1.py`| Use PPO1 to train an ant to climb down the stairs in Gates building, using visual input from the camera.  |\n|`train_ant_gibson_flagrun_ppo1.py`| Use PPO1 to train an ant to chase a target (a red cube) in Gates building. Everytime the ant gets to target(or time out), the target will change position.|\n|`train_husky_gibson_flagrun_ppo1.py`|Use PPO1 to train a car to chase a target (a red cube) in Gates building. Everytime the car gets to target(or time out), the target will change position. |\n\nROS Configuration\n---------\n\nWe provide examples of configuring Gibson with ROS [here](examples/ros/gibson-ros). We use turtlebot as an example, after a policy is trained in Gibson, it requires minimal changes to deploy onto a turtlebot. See [README](examples/ros/gibson-ros) for more details.\n\n\n\n\nCoding Your RL Agent\n====\nYou can code your RL agent following our convention. The interface with our environment is very simple (see some examples in the end of this section).\n\nFirst, you can create an environment by creating an instance of classes in `gibson/core/envs` folder. 
\n\n\n```python\nenv = AntNavigateEnv(is_discrete=False, config = config_file)\n```\n\nThen advance the simulation by one step with `env.step`, and reset it with `env.reset()`.\n```python\nobs, rew, env_done, info = env.step(action)\n```\n`obs` gives the observation of the robot. It is a dictionary with each component as a key-value pair. Its keys are specified by the user in the config file. E.g. `obs['nonviz_sensor']` is proprioceptive sensor data, `obs['rgb_filled']` is RGB camera data.\n\n`rew` is the defined reward. `env_done` marks the end of one episode, for example, when the robot dies. \n`info` gives some additional information about this step; sometimes we use this to pass additional non-visual sensor values.\n\nWe mostly followed the [OpenAI gym](https://github.com/openai/gym) convention when designing the interface between RL algorithms and the environment. In order to help users start with the environment more quickly, we\nprovide some examples at [examples/train](examples/train). The RL algorithms that we use are from [OpenAI baselines](https://github.com/openai/baselines) with some adaptation to work with hybrid visual and non-visual sensory data.\nIn particular, we used [PPO](https://github.com/openai/baselines/tree/master/baselines/ppo1) and a speed-optimized version of [PPO](https://github.com/openai/baselines/tree/master/baselines/ppo2).\n\n\nEnvironment Configuration\n=================\nEach environment is configured with a `yaml` file. Examples of `yaml` files can be found in the `examples/configs` folder. The parameters of the file are explained below. 
For more information specific to the Bullet Physics engine, you can see the documentation [here](https://docs.google.com/document/d/10sXEhzFRSnvFcl3XxNGhnD4N2SedqwdAvK3dsihxVUA/edit).\n\n| Argument name        | Example value           | Explanation  |\n|:-------------:|:-------------:| :-----|\n| envname      | AntClimbEnv | Environment name, make sure it is the same as the class name of the environment |\n| model_id      | space1-space8      |   Scene id, in beta release, choose from space1-space8 |\n| target_orn | [0, 0, 3.14]      |   target orientation as Euler angles (in radians) for navigating, the reference frame is world frame. For non-navigation tasks, this parameter is ignored. |\n|target_pos | [-7, 2.6, -1.5] | target position (in meters) for navigating, the reference frame is world frame. For non-navigation tasks, this parameter is ignored. |\n|initial_orn | [0, 0, 3.14] | initial orientation (in radians) for navigating, the reference frame is world frame |\n|initial_pos | [-7, 2.6, 0.5] | initial position (in meters) for navigating, the reference frame is world frame|\n|fov | 1.57  | field of view for the camera, in radians |\n| use_filler | true/false  | use neural network filler or not. It is recommended to leave this argument true. See [Gibson Environment website](http://gibson.vision/) for more information. |\n|display_ui | true/false  | Gibson has two ways of showing visual output: either in multiple windows, or aggregated into a single pygame window. This argument determines whether to show the pygame ui or not; in a production environment (training), you need to turn this off |\n|show_diagnostics | true/false  | show diagnostics (including fps, robot position and orientation, accumulated rewards) overlaid on the RGB image |\n|ui_num |2  | how many ui components to show, this should be the length of ui_components. 
|\n| ui_components | [RGB_FILLED, DEPTH]  | which are the ui components, choose from [RGB_FILLED, DEPTH, NORMAL, SEMANTICS, RGB_PREFILLED] |\n|output | [nonviz_sensor, rgb_filled, depth]  | output of the environment to the robot, choose from  [nonviz_sensor, rgb_filled, depth]. These values are independent of `ui_components`, as `ui_components` determines what to show and `output` determines what the robot receives. |\n|resolution | 512 | choose from [128, 256, 512] resolution of rgb/depth image |\n|initial_orn | [0, 0, 3.14] | initial orientation (in radian) for navigating, the reference frame is world frame |\n|speed : timestep | 0.01 | length of one physics simulation step in seconds(as defined in [Bullet](https://docs.google.com/document/d/10sXEhzFRSnvFcl3XxNGhnD4N2SedqwdAvK3dsihxVUA/edit)). For example, if timestep=0.01 sec, frameskip=10, and the environment is running at 100fps, it will be 10x real time. Note: setting timestep above 0.1 can cause instability in current version of Bullet simulator since an object should not travel faster than its own radius within one timestep. You can keep timestep at a low value but increase frameskip to simulate at a faster speed. See [Bullet guide](https://docs.google.com/document/d/10sXEhzFRSnvFcl3XxNGhnD4N2SedqwdAvK3dsihxVUA/edit) under \"discrete collision detection\" for more info.|\n|speed : frameskip | 10 | how many timestep to skip when rendering frames. See above row for an example. For tasks that does not require high frequency control, you can set frameskip to larger value to gain further speed up. |\n|mode | gui/headless/web_ui  | gui or headless, if in a production environment (training), you need to turn this to headless. In gui mode, there will be visual output; in headless mode, there will be no visual output. In addition to that, if you set mode to web_ui, it will behave like in headless mode but the visual will be rendered to a web UI server. 
([more information](#web-user-interface))|\n|verbose |true/false  | show diagnostics in terminal |\n|fast_lq_render| true/false| if there is fast_lq_render in yaml file, Gibson will use a smaller filler network, this will render faster but generate slightly lower quality camera output. This option is useful for training RL agents fast. |\n\n#### Making Your Customized Environment\nGibson provides a set of methods for you to define your own environments. You can follow the existing environments inside `gibson/core/envs`.\n\n| Method name        | Usage           |\n|:------------------:|:---------------------------|\n| robot.render_observation(pose) | Render new observations based on pose, returns a dictionary. |\n| robot.get_observation() | Get observation at current pose. Needs to be called after robot.render_observation(pose). This does not induce extra computation. |\n| robot.get_position() | Get current robot position. |\n| robot.get_orientation() | Get current robot orientation. |\n| robot.eyes.get_position() | Get current robot perceptive camera position. |\n| robot.eyes.get_orientation() | Get current robot perceptive camera orientation. | \n| robot.get_target_position() | Get robot target position. |\n| robot.apply_action(action) | Apply action to robot. |  \n| robot.reset_new_pose(pos, orn) | Reset the robot to any pose. |\n| robot.dist_to_target() | Get current distance from robot to target. |\n\nGoggles: transferring the agent to real-world\n=================\nGibson includes a baked-in domain adaptation mechanism, named Goggles, for when an agent trained in Gibson is going to be deployed in the real world (i.e. operate based on images coming from an onboard camera). The mechanism is essentially a learned inverse function that alters the frames coming from a real camera to what they would look like if they were rendered via Gibson, and hence dissolves the domain gap. 
\n\n\u003cimg src=http://gibson.vision/public/img/figure4.jpg width=\"600\"\u003e\n\n\n**More details:** With all the imperfections in point cloud rendering, it has proven difficult to get completely photo-realistic rendering with neural network fixes. The remaining issues create a domain gap between the synthesized and real images. Therefore, we formulate the rendering problem as forming a joint space ensuring a correspondence between rendered and real images, rather than trying to (unsuccessfully) render images that are identical to real ones. This provides a deterministic pathway for traversing across these domains and hence undoing the gap. We add another network \"u\" for the target image (I_t) and define the rendering loss to minimize the distance between f(I_s) and u(I_t), where \"f\" and \"I_s\" represent the filler neural network and point cloud rendering output, respectively (see the loss in the figure above). We use the same network structure for f and u. The function u(I) is trained to alter the observation in the real world, I_t, to look like the corresponding I_s and consequently dissolve the gap. We named the u network goggles, as it resembles corrective lenses for the agent for deployment in the real world. Detailed formulation and discussion of the mechanism can be found in the paper. You can download the function u and apply it when you deploy your trained agent in the real world.\n\nTo use goggles, you preferably need a camera with a depth sensor; we provide an example [here](examples/ros/gibson-ros/goggle.py) for Kinect. The trained goggle functions are stored in `assets/unfiller_{resolution}.pth`, and each one is paired with one filler function. You need to use the correct one depending on which filler function is used. 
If you don't have a camera with a depth sensor, we also provide an RGB-only example [here](examples/demo/goggle_video.py).\n\n\nCitation\n=================\n\nIf you use Gibson Environment's software or database, please cite:\n```\n@inproceedings{xiazamirhe2018gibsonenv,\n  title={Gibson {Env}: real-world perception for embodied agents},\n  author={Xia, Fei and R. Zamir, Amir and He, Zhi-Yang and Sax, Alexander and Malik, Jitendra and Savarese, Silvio},\n  booktitle={Computer Vision and Pattern Recognition (CVPR), 2018 IEEE Conference on},\n  year={2018},\n  organization={IEEE}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstanfordvl%2Fgibsonenv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstanfordvl%2Fgibsonenv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstanfordvl%2Fgibsonenv/lists"}