{"id":21572922,"url":"https://github.com/transic-robot/transic","last_synced_at":"2025-07-16T19:31:35.848Z","repository":{"id":240162243,"uuid":"789200237","full_name":"transic-robot/transic","owner":"transic-robot","description":null,"archived":false,"fork":false,"pushed_at":"2024-09-05T23:00:41.000Z","size":7565,"stargazers_count":63,"open_issues_count":3,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-24T12:02:34.683Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/transic-robot.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-19T23:01:22.000Z","updated_at":"2024-11-21T03:44:33.000Z","dependencies_parsed_at":"2024-11-24T15:04:27.202Z","dependency_job_id":null,"html_url":"https://github.com/transic-robot/transic","commit_stats":null,"previous_names":["transic-robot/transic"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/transic-robot/transic","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transic-robot%2Ftransic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transic-robot%2Ftransic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transic-robot%2Ftransic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transic-robot%2Ftransic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/transic-robot","download_url":"https://codeload
.github.com/transic-robot/transic/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/transic-robot%2Ftransic/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265534554,"owners_count":23783851,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-24T12:01:17.621Z","updated_at":"2025-07-16T19:31:35.204Z","avatar_url":"https://github.com/transic-robot.png","language":"Python","funding_links":[],"categories":["🎓 Getting Started"],"sub_categories":[],"readme":"# TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction\n\u003cdiv align=\"center\"\u003e\n\n[Yunfan Jiang](https://yunfanj.com/),\n[Chen Wang](https://www.chenwangjeremy.net/),\n[Ruohan Zhang](https://ai.stanford.edu/~zharu/),\n[Jiajun Wu](https://jiajunwu.com/),\n[Li Fei-Fei](https://profiles.stanford.edu/fei-fei-li)\n\n\u003cimg src=\"media/SUSig-red.png\" width=200\u003e\n\n**Conference on Robot Learning (CoRL) 2024**\n\n[[Website]](https://transic-robot.github.io/)\n[[arXiv]](https://arxiv.org/abs/2405.10315)\n[[PDF]](https://transic-robot.github.io/assets/pdf/transic_paper.pdf)\n[[TRANSIC-Envs]](https://github.com/transic-robot/transic-envs)\n[[Model Weights]](https://huggingface.co/transic-robot/models)\n[[Training Data]](https://huggingface.co/datasets/transic-robot/data)\n[[Model Card]](https://huggingface.co/transic-robot/models/blob/main/README.md)\n[[Data Card]](https://huggingface.co/datasets/transic-robot/data/blob/main/README.md)\n\n[![Python 
Version](https://img.shields.io/badge/Python-3.8-blue.svg)](https://github.com/transic-robot/transic)\n[\u003cimg src=\"https://img.shields.io/badge/Framework-PyTorch-red.svg\"/\u003e](https://pytorch.org/)\n[![GitHub license](https://img.shields.io/github/license/transic-robot/transic)](https://github.com/transic-robot/transic/blob/main/LICENSE)\n\n![](media/method_overview.gif)\n______________________________________________________________________\n\u003c/div\u003e\n\nLearning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge *a priori*. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose **TRANSIC**, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. **TRANSIC** allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, **TRANSIC** is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort.\n\n## Table of Contents\n1. [Installation](#Installation)\n2. [Usage](#usage)\n3. [Acknowledgement](#acknowledgement)\n4. [Check out Our Paper](#check-out-our-paper)\n5. 
[License](#license)\n\n## Installation\nFirst, follow the [instructions](https://github.com/transic-robot/transic-envs/#Installation) to create a virtual environment, install IsaacGym, and install our simulation codebase [TRANSIC-Envs](https://github.com/transic-robot/transic-envs).\n\nNow clone this repo and install it.\n```bash\ngit clone https://github.com/transic-robot/transic\ncd transic\npip3 install -e .\n```\n\nOptionally, if you would like to use our [model checkpoints](https://huggingface.co/transic-robot/models) and [training data](https://huggingface.co/datasets/transic-robot/data), download them from 🤗Hugging Face.\n```bash\ngit clone https://huggingface.co/transic-robot/models transic-models\ngit clone https://huggingface.co/datasets/transic-robot/data transic-data\n```\n\n## Usage\n### Training Teacher Policies\nThe basic syntax to launch teacher policy RL training is\n```bash\npython3 main/rl/train.py task=\u003ctask_name\u003e num_envs=\u003cnum_of_parallel_envs\u003e \\\n  sim_device=cuda:\u003cgpu_id\u003e rl_device=cuda:\u003cgpu_id\u003e graphics_device_id=\u003cgpu_id\u003e\n```\nYou need to replace anything within `\u003c\u003e` with suitable values. For example, you can choose `task_name` from the tasks listed [here](https://github.com/transic-robot/transic-envs/#overview).\n\n\u003e [!TIP]\n\u003e You may need to tune the number of parallel envs `num_envs=\u003cnum_of_parallel_envs\u003e` depending on your GPU memory to achieve the maximum throughput.\n\n\u003e [!TIP]\n\u003e You may use [wandb](https://wandb.ai/site) to log experiments. 
To do this, add `wandb_activate=true` to the command and specify your wandb username and project name through `wandb_entity=\u003cyour_wandb_user_name\u003e wandb_project=\u003cyour_wandb_project_name\u003e`.\n\nThe training command will create a folder called `runs/{experiment_name}` under the current directory, where you can find the training config and saved checkpoints.\n\nTo test a checkpoint, run the following command.\n```bash\npython3 main/rl/train.py task=\u003ctask_name\u003e num_envs=\u003cnum_of_parallel_envs\u003e \\\n  test=true checkpoint=\u003cpath_to_your_checkpoint\u003e\n```\n\n\u003e [!TIP]\n\u003e To visualize a trained policy, use either `display=true` or `headless=false`. The first option will pop up an OpenCV window showing the env-level workspace from a frontal view. This doesn't require a physical monitor attached. The second option will open the IsaacGym GUI and you will see all parallel environments. This REQUIRES a physical monitor connected to your workstation.\n\n\u003e [!TIP]\n\u003e You can also log policy rollouts as mp4 videos to your wandb. Simply add `capture_video=true` to the test command.\n\n### Training Student Policies\n#### Prepare the Training Data\nWe use trained teacher policies to generate data for student policies. To do so, simply run the following command.\n```bash\npython3 main/rl/train.py task=\u003ctask_name\u003e num_envs=\u003cnum_of_parallel_envs\u003e \\\n  test=true checkpoint=\u003cpath_to_your_checkpoint\u003e \\\n  save_rollouts=true\n```\nRollouts will be saved in `runs/{experiment_name}/rollouts.hdf5`.\n\n\u003e [!TIP]\n\u003e By default, this will generate 10K successful trajectories. Each trajectory will have a minimum length of 20 steps. You can change these behaviors by setting `save_successful_rollouts_only`, `num_rollouts_to_save`, and `min_episode_length`.\n\nWe provide weights for trained RL teachers. To use them, replace `checkpoint` with the suitable path. 
For example,\n```bash\npython3 main/rl/train.py task=Stabilize \\\n  test=true checkpoint=\u003cpath_to_transic-models/rl/stabilize.pth\u003e \\\n  save_rollouts=true\n```\n\nWe also provide pre-generated data for student distillation. It can be found in the `distillation` folder from our [🤗Hugging Face data repository](https://huggingface.co/datasets/transic-robot/data).\n\n#### Start Training\nThe basic syntax to launch student policy distillation is\n```bash\npython3 main/distillation/train.py task=\u003ctask_name\u003e distillation_student_arch=\u003carch\u003e \\\n  bs=\u003cbatch_size\u003e num_envs=\u003cnum_of_parallel_envs\u003e exp_root_dir=\u003cwhere_to_log_experiment\u003e \\\n  data_path=\u003cpath_to_hdf5_file\u003e matched_scene_data_path=\u003cpath_to_matched_scene_data\u003e \\\n  sim_device=cuda:\u003cgpu_id\u003e rl_device=cuda:\u003cgpu_id\u003e graphics_device_id=\u003cgpu_id\u003e gpus=\\[\u003cgpus\u003e\\] \\\n  wandb_project=\u003cyour_wandb_project_name\u003e\n```\nSimilarly, you need to replace anything within `\u003c\u003e` with suitable values. For example, you can choose `task_name` from the tasks listed [here](https://github.com/transic-robot/transic-envs/#overview), but make sure it has the `PCD` suffix, since you are training student policies with visual observations. You can select either `pointnet` or `rnn_pointnet` for the policy architecture. You may need to tune the batch size `bs` and number of parallel environments `num_envs` to fit into your GPU. The `exp_root_dir` specifies where you would like to log the experiment. The `data_path` is where your generated rollouts are saved. The `matched_scene_data_path` points to a static, fixed dataset that we use to regularize the point cloud encoder. 
It can be found as `distillation/matched_point_cloud_scenes.h5` from our [🤗Hugging Face data repository](https://huggingface.co/datasets/transic-robot/data).\n\n\u003e [!WARNING]\n\u003e By default we add data randomization during the distillation. You may opt to set `module.enable_pcd_augmentation=false` to turn off point cloud augmentation and `module.enable_prop_augmentation=false` to turn off proprioception augmentation. But this will lead to suboptimal student policies that are not robust enough for sim-to-real transfer.\n\n\u003e [!TIP]\n\u003e The argument `gpus` specifies the devices to use for distillation and follows the same [syntax](https://lightning.ai/docs/pytorch/stable/common/trainer.html#devices) as in PyTorch Lightning. Other device-related arguments such as `sim_device`, `rl_device`, and `graphics_device_id` control which GPU IsaacGym uses. GPUs for distillation and simulation do not need to be the same. In fact, we also support multi-GPU distillation with IsaacGym running on another GPU for evaluation.\n\nThe experiment will be logged at `exp_root_dir`, where you can find the saved config, logs, tensorboard, and checkpoints. Since we periodically switch between training and simulation evaluation, policies are saved based on their success rates. 
You can find weights of our student policies in the folder `student` from our [🤗Hugging Face model repository](https://huggingface.co/transic-robot/models).\n\nTo test and visualize trained student policies, run the following command.\n```bash\npython3 main/distillation/test.py task=\u003ctask_name\u003e distillation_student_arch=\u003carch\u003e \\\n  bs=null num_envs=\u003cnum_of_parallel_envs\u003e exp_root_dir=\u003cwhere_to_log_experiment\u003e \\\n  data_path=null matched_scene_data_path=null \\\n  test.ckpt_path=\u003cpath_to_student_policy\u003e display=true\n```\n\n### Correction Data Collection\nOnce we have the simulation base policy, we deploy it on a real robot while a human operator monitors its execution. The human operator intervenes in the policy execution when necessary and provides correction through teleoperation. To collect such correction data, check out the script\n```bash\npython3 main/correction_data_collection.py \\\n  --base-policy-ckpt-path \u003cpath_to_simulation_base_policy_ckpt\u003e \\\n  --data-save-path \u003cwhere_to_save_correction_data\u003e\n```\nWe note that the real-world observation pipeline and real robot controller may differ across groups. Therefore, you have to fill in the instantiation of these two components in the script. In our case, we use [`deoxys`](https://github.com/UT-Austin-RPL/deoxys_control) as our robot controller. We provide an example of an observation pipeline [here](transic/real_world/obs.py).\n\nWe provide correction data we collected during the project in the `correction_data` folder from our [🤗Hugging Face data repository](https://huggingface.co/datasets/transic-robot/data).\n\n### Training Residual Policies\nOnce we have enough correction data, we can train residual policies in two steps. 
First, we only learn the residual action head.\n```bash\npython3 main/residual/train.py residual_policy_arch=\u003carch\u003e \\\n  data_dir=\u003ccorrection_data_path\u003e exp_root_dir=\u003cwhere_to_log_experiment\u003e \\\n  residual_policy_task=\u003ctask\u003e \\\n  gpus=\u003cgpus\u003e bs=\u003cbatch_size\u003e \\\n  module.intervention_pred_loss_weight=0.0 \\\n  wandb_project=\u003cyour_wandb_project_name\u003e\n```\nFor `residual_policy_task`, use `insert` for the task Insert and `default` for others.\n\nWe then freeze everything else and learn only the head that predicts whether to intervene.\n```bash\npython3 main/residual/train.py residual_policy_arch=\u003carch\u003e \\\n  data_dir=\u003ccorrection_data_path\u003e exp_root_dir=\u003cwhere_to_log_experiment\u003e \\\n  residual_policy_task=\u003ctask\u003e \\\n  gpus=\u003cgpus\u003e bs=\u003cbatch_size\u003e \\\n  module.residual_policy.update_intervention_head_only=True \\\n  module.residual_policy.ckpt_path_if_update_intervention_head_only=\u003cpath_to_ckpt_from_the_first_step\u003e \\\n  wandb_project=\u003cyour_wandb_project_name\u003e\n```\n\n\u003e [!NOTE]\n\u003e Residual policies can also be trained in a single step where both the action and intervention prediction heads are jointly learned. We found that the two-step method leads to overall better residual policies.\n\nOur trained residual policies can be found in the folder `residual` from our [🤗Hugging Face model repository](https://huggingface.co/transic-robot/models).\n\n### Integrated Deployment\nOnce we have both the simulation base policy and the residual policy, we can integrate them for successful sim-to-real transfer. 
Check out the script\n```bash\npython3 main/integrated_deployment.py \\\n  --base-policy-ckpt-path \u003cpath_to_simulation_base_policy_ckpt\u003e \\\n  --residual-policy-ckpt-path \u003cpath_to_residual_policy_ckpt\u003e\n```\nSimilarly, you need to fill in the instantiation for the real-world observation pipeline and the real-robot controller.\n\n## Acknowledgement\nWe would like to acknowledge the following open-source project that greatly inspired our development.\n- [deoxys](https://github.com/UT-Austin-RPL/deoxys_control)\n\n## Check out Our Paper\nOur paper is posted on [arXiv](https://arxiv.org/abs/2405.10315). If you find our work useful, please consider citing us!\n\n```bibtex\n@inproceedings{jiang2024transic,\n  title     = {TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction},\n  author    = {Yunfan Jiang and Chen Wang and Ruohan Zhang and Jiajun Wu and Li Fei-Fei},\n  booktitle = {Conference on Robot Learning},\n  year      = {2024}\n}\n```\n\n## License\nThis codebase is released under the [MIT License](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftransic-robot%2Ftransic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftransic-robot%2Ftransic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftransic-robot%2Ftransic/lists"}