{"id":20731485,"url":"https://github.com/opendrivelab/clover","last_synced_at":"2025-04-06T08:11:38.634Z","repository":{"id":257284167,"uuid":"851540600","full_name":"OpenDriveLab/CLOVER","owner":"OpenDriveLab","description":"[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation","archived":false,"fork":false,"pushed_at":"2024-12-05T08:06:48.000Z","size":14248,"stargazers_count":108,"open_issues_count":1,"forks_count":6,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-04-06T03:38:45.310Z","etag":null,"topics":["closed-loop-control","generative-model","robot-manipulation","visuomotor-control"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenDriveLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["OpenDriveLab"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2024-09-03T09:29:33.000Z","updated_at":"2025-04-04T03:34:25.000Z","dependencies_parsed_at":"2024-09-15T18:51:24.489Z","dependency_job_id":"321d1b61-6d15-46ad-ac1d-11e83bcff2a0","html_url":"https://github.com/OpenDriveLab/CLOVER","commit_stats":null,"previous_names":["opendrivelab/clover"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDriveLab%2FCLOVER","tags_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/OpenDriveLab%2FCLOVER/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDriveLab%2FCLOVER/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDriveLab%2FCLOVER/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenDriveLab","download_url":"https://codeload.github.com/OpenDriveLab/CLOVER/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247451652,"owners_count":20940939,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["closed-loop-control","generative-model","robot-manipulation","visuomotor-control"],"created_at":"2024-11-17T05:15:00.002Z","updated_at":"2025-04-06T08:11:38.616Z","avatar_url":"https://github.com/OpenDriveLab.png","language":"Python","funding_links":["https://github.com/sponsors/OpenDriveLab"],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"left\"\u003e :four_leaf_clover: CLOVER \u003c/h1\u003e \n\nThe official implementation of our **NeurIPS 2024** paper: \\\n**Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation**\n\u003cdiv id=\"top\" align=\"center\"\u003e\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/clover_teaser.png\" width=\"1000px\" \u003e\n\u003c/p\u003e\n\u003c/div\u003e\n\n\u003e [Qingwen Bu](https://scholar.google.com/citations?user=-JCRysgAAAAJ\u0026hl=zh-CN\u0026oi=ao), [Jia Zeng](https://scholar.google.com/citations?hl=zh-CN\u0026user=kYrUfMoAAAAJ), [Li 
Chen](https://scholar.google.com/citations?user=ulZxvY0AAAAJ\u0026hl=zh-CN), Yanchao Yang, Guyue Zhou, Junchi Yan, Ping Luo, Heming Cui, Yi Ma and Hongyang Li\n\n\u003e 📜 Preprint: \u003ca href=\"https://arxiv.org/abs/2409.09016\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-Paper-\u003ccolor\u003e\"\u003e\u003c/a\u003e :pushpin: Poster: \u003ca href=\"https://docs.google.com/presentation/d/1C0YEx6KPV1s0paW6XMeL7oANujvsLHsyVko-lEBP9c0/edit?usp=sharing\"\u003e\u003cimg src=\"https://img.shields.io/badge/Google%20Drive-4285F4?logo=googledrive\u0026logoColor=fff\"\u003e\u003c/a\u003e\n\n\u003e :mailbox_with_mail: If you have any questions, please feel free to contact: *Qingwen Bu* ( qwbu01@sjtu.edu.cn )\n\nFull code and checkpoints release is coming soon. Please stay tuned.🦾\n\n## :fire: Highlight\n\n* :four_leaf_clover: ​**CLOVER**  employs a text-conditioned video diffusion model for generating visual plans as reference inputs, then these sub-goals guide the feedback-driven policy to generate actions with an error measurement strategy.\n\n\u003cdiv id=\"top\" align=\"center\"\u003e\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/closed-loop.jpg\" width=\"900px\" \u003e\n\u003c/p\u003e\n\u003c/div\u003e\n\n* Owing to the closed-loop attribute, ​**CLOVER** is robust to visual distraction and object variation:\n\u003cdiv id=\"top\" align=\"center\"\u003e\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"assets/vis_robustness.jpg\" width=\"900px\" \u003e\n\u003c/p\u003e\n\u003c/div\u003e\n\n* This closed-loop mechanism enables achieving the desired states accurately and reliably, thereby facilitating the execution of long-term tasks:\n\u003cdiv id=\"top\" align=\"center\"\u003e\n\u003cp align=\"center\"\u003e\n\u003ctd\u003e\u003cvideo src=\"https://github.com/user-attachments/assets/af8af7fa-98e4-48bc-a9e3-eb8af9cd7348\" autoplay width=\"800px\"\u003e\u003c/td\u003e\n\u003c/p\u003e\n\u003c/div\u003e\n\n\n\n\n\n## :loudspeaker: News\n\n- 
**[2024/09/16]** We released our paper on [arXiv](https://arxiv.org/abs/2409.09016).\n- **[2024/12/01]** We have open-sourced the entire codebase and will keep it updated. Please give it a try!\n\n## :pushpin: TODO list\n\n- [x] Training script for visual planner\n- [x] Checkpoints release (*Scheduled Release Date*: **Mid-October, 2024**)\n- [x] Evaluation code on CALVIN (*Scheduled Release Date*: **Mid-October, 2024**)\n- [x] Policy training code on CALVIN (*Estimated Release Period*: **November, 2024**)\n\n\n\n## :video_game: Getting started \u003ca name=\"installation\"\u003e\u003c/a\u003e\n\nTraining was conducted with **PyTorch 1.13.1**, **CUDA 11.7**, **Ubuntu 22.04**, and **NVIDIA Tesla A100 (80 GB)**. The closed-loop evaluation on CALVIN was run on a system with **NVIDIA RTX 3090**.\n\nWe also tested with **PyTorch 2.2.0 + CUDA 11.8**, and training works with that setup as well.\n\n1. (Optional) We use conda to manage the environment.\n\n```bash\nconda create -n clover python=3.8\nconda activate clover\n```\n\n2. Install dependencies.\n\n```bash\npip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117\npip install git+https://github.com/hassony2/torch_videovision\npip install -e .\n```\n\n3. 
Installation of CALVIN simulator.\n\n```bash\ngit clone --recurse-submodules https://github.com/mees/calvin.git\nexport CALVIN_ROOT=$(pwd)/calvin\ncd $CALVIN_ROOT\nsh install.sh\n```\n\n## :cd: Checkpoints\n\nWe release model weights of our **Visual Planner** and **Feedback-driven Policy** at [HuggingFace](https://huggingface.co/qwbu/CLOVER).\n\n## Training of Visual Planner \u003ca name=\"TrainingVP\"\u003e\u003c/a\u003e\n\n- ### Requirement\n\n  The visual planner requires **24 GB** GPU VRAM with a batch size of 4 (per GPU), a video length of 8, and an image size of 128.\n\n- ### Preparation\n\n  * We use [OpenAI-CLIP](https://huggingface.co/openai/clip-vit-large-patch14) to encode task instructions for conditioning.\n\n- ### Initiate training of the visual planner (video diffusion model) on CALVIN\n\n  \u003e Please modify **accelerate_cfg.yaml** first according to your setup.\n\n```bash\naccelerate launch --config_file accelerate_cfg.yaml train.py \\\n    --learning_rate 1e-4 \\\n    --train_num_steps 300000 \\\n    --save_and_sample_every 10000 \\\n    --train_batch_size 32 \\\n    --sample_per_seq 8 \\\n    --sampling_step 5 \\\n    --with_text_conditioning \\\n    --diffusion_steps 100 \\\n    --sample_steps 10 \\\n    --with_depth \\\n    --flow_reg \\\n    --results_folder *path_to_save_your_ckpts*\n```\n\n## Training of Feedback Policy \u003ca name=\"TrainingFP\"\u003e\u003c/a\u003e\n\n- ### Preparation\n\n  * Only VC-1 is currently supported as the visual encoder. Please set up the environment and download the pre-trained checkpoints following [eai-vc](https://github.com/facebookresearch/eai-vc).\n  * Set your **calvin_dataset_path** in ```FeedbackPolicy/train_calvin.sh```\n\n- ### Initiate training of the Feedback-driven Policy (Inverse Dynamics Model) on CALVIN\n```\ncd ./FeedbackPolicy\nbash train_calvin.sh\n```\n\n\n## Evaluation \u003ca name=\"Evaluation\"\u003e\u003c/a\u003e\n\n- ### Preparation\n\n    1. 
Set your CALVIN and checkpoint paths in *FeedbackPolicy/eval_calvin.sh*\n    2. We train our policy with an input size of 192×192. Please modify the [VC-1 Config](https://github.com/facebookresearch/eai-vc/blob/76fe35e87b1937168f1ec4b236e863451883eaf3/vc_models/src/vc_models/conf/model/vc1_vitb.yaml#L7) accordingly, setting `img_size: 192` and `use_cls: False`.\n\n- ### Initiate evaluation on CALVIN simply with\n\n```bash\ncd ./FeedbackPolicy\nbash eval_calvin.sh\n```\n    \n\n\n## :pencil: Citation\n\nIf you find the project helpful for your research, please consider citing our paper:\n\n```bibtex\n@article{bu2024clover,\n  title={Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation},\n  author={Bu, Qingwen and Zeng, Jia and Chen, Li and Yang, Yanchao and Zhou, Guyue and Yan, Junchi and Luo, Ping and Cui, Heming and Ma, Yi and Li, Hongyang},\n  journal={arXiv preprint arXiv:2409.09016},\n  year={2024}\n}\n```\n\n## Acknowledgements\n\nWe thank [AVDC](https://github.com/flow-diffusion/AVDC) and [RoboFlamingo](https://github.com/RoboFlamingo/RoboFlamingo) for their open-sourced work!\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopendrivelab%2Fclover","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopendrivelab%2Fclover","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopendrivelab%2Fclover/lists"}