{"id":15581388,"url":"https://github.com/quantumiracle/consistency_model_for_reinforcement_learning","last_synced_at":"2025-04-28T17:19:38.393Z","repository":{"id":221088375,"uuid":"753411966","full_name":"quantumiracle/Consistency_Model_For_Reinforcement_Learning","owner":"quantumiracle","description":"Official implementation for:  Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning ICLR'24","archived":false,"fork":false,"pushed_at":"2024-08-28T18:11:30.000Z","size":50,"stargazers_count":25,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-28T17:19:30.774Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quantumiracle.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-06T04:09:43.000Z","updated_at":"2025-04-09T03:27:12.000Z","dependencies_parsed_at":"2024-02-15T17:46:27.933Z","dependency_job_id":"85bd98f3-0c97-4138-90e2-10387f21a82b","html_url":"https://github.com/quantumiracle/Consistency_Model_For_Reinforcement_Learning","commit_stats":null,"previous_names":["quantumiracle/consistency_model_for_reinforcement_learning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumiracle%2FConsistency_Model_For_Reinforcement_Learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumiracle%2FConsistency_Model_For_Reinforcement_Learning/
tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumiracle%2FConsistency_Model_For_Reinforcement_Learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumiracle%2FConsistency_Model_For_Reinforcement_Learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quantumiracle","download_url":"https://codeload.github.com/quantumiracle/Consistency_Model_For_Reinforcement_Learning/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251352638,"owners_count":21575865,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-02T19:43:28.668Z","updated_at":"2025-04-28T17:19:38.370Z","avatar_url":"https://github.com/quantumiracle.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Consistency Models for RL \u0026mdash; Official PyTorch Implementation\nOfficial implementation for:\n\n**Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning**\u003cbr\u003e\nZihan Ding, Chi Jin \u003cbr\u003e\n[https://arxiv.org/abs/2309.16984](https://arxiv.org/abs/2309.16984) \u003cbr\u003e\n\n## Requirements\nInstallations of [PyTorch](https://pytorch.org/), [MuJoCo](https://github.com/deepmind/mujoco), and [D4RL](https://github.com/Farama-Foundation/D4RL) are needed. 
Please see the ``requirements.txt`` for environment setup details.\n```\npip install -r requirements.txt\n```\n\n## Run\nYou can use either a diffusion model or a consistency model as the policy.\n\n### Dataset\nFirst, download the D4RL dataset with:\n```\npython download_data.py\n```\nThe data will be saved in `./dataset/`.\n\n### Offline RL\n```\n# train offline RL Consistency-AC for hopper-medium-v2 task\npython offline.py --env_name hopper-medium-v2 --model consistency --ms offline --exp RUN_NAME --save_best_model --lr_decay\n# train offline RL Diffusion-QL for walker2d-medium-expert-v2 task\npython offline.py --env_name walker2d-medium-expert-v2 --model diffusion --ms offline --exp RUN_NAME --save_best_model --lr_decay\n```\n### Online RL\nFrom scratch:\n```\n# train online RL Consistency-AC for hopper-medium-v2 task\npython online.py --env_name hopper-medium-v2 --num_envs 3 --model consistency --exp RUN_NAME\n# train online RL Diffusion-QL for walker2d-medium-expert-v2 task\npython online.py --env_name walker2d-medium-expert-v2 --num_envs 3 --model diffusion --exp RUN_NAME\n```\nOnline RL initialized with offline pre-trained models (offline-to-online):\n```\npython online.py --env_name kitchen-mixed-v0 --num_envs 3 --model consistency --exp online_test --load_model 'results/**PATH**' --load_id 'online'\n```\nFor example, the command above loads a model saved at `results/**PATH**/actor_online.pth` and uses it to initialize online training.\n\n### Training Scripts\nUse bash scripts:\n```\nbash scripts/offline.sh\nbash scripts/online.sh\n```\n\nUse Slurm scripts:\n```\nsbatch scripts/offline.slurm\nsbatch scripts/online.slurm\nsbatch scripts/offline2online.slurm\n```\n\n\n## Citation\n\nIf you find this open-source release useful, please cite it in your paper:\n```\n@article{ding2023consistency,\n  title={Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning},\n  author={Ding, Zihan and Jin, Chi},\n  journal={arXiv preprint 
arXiv:2309.16984},\n  year={2023}\n}\n```\n\n## Acknowledgement\nWe acknowledge the original official repo of [Diffusion Policy](https://github.com/Zhendong-Wang/Diffusion-Policies-for-Offline-RL) and the corresponding paper: [https://arxiv.org/abs/2208.06193](https://arxiv.org/abs/2208.06193).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantumiracle%2Fconsistency_model_for_reinforcement_learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquantumiracle%2Fconsistency_model_for_reinforcement_learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantumiracle%2Fconsistency_model_for_reinforcement_learning/lists"}