{"id":19846016,"url":"https://github.com/vchitect/vchitect-2.0","last_synced_at":"2025-05-15T07:03:47.301Z","repository":{"id":257113079,"uuid":"854572768","full_name":"Vchitect/Vchitect-2.0","owner":"Vchitect","description":"Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models","archived":false,"fork":false,"pushed_at":"2025-03-17T14:18:12.000Z","size":94139,"stargazers_count":910,"open_issues_count":5,"forks_count":21,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-05-15T07:02:51.811Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://vchitect.intern-ai.org.cn/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Vchitect.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-09-09T12:18:28.000Z","updated_at":"2025-05-08T15:42:07.000Z","dependencies_parsed_at":"2025-04-14T09:08:53.497Z","dependency_job_id":null,"html_url":"https://github.com/Vchitect/Vchitect-2.0","commit_stats":null,"previous_names":["vchitect/vchitect-2.0"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vchitect%2FVchitect-2.0","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vchitect%2FVchitect-2.0/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vchitect%2FVchitect-2.0/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vchitect%2FVchitect-2.0/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/Vchitect","download_url":"https://codeload.github.com/Vchitect/Vchitect-2.0/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254291961,"owners_count":22046424,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T13:10:05.079Z","updated_at":"2025-05-15T07:03:47.277Z","avatar_url":"https://github.com/Vchitect.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models\n\n\u003c!-- \u003cp align=\"center\" width=\"100%\"\u003e\n\u003cimg src=\"ISEKAI_overview.png\"  width=\"80%\" height=\"80%\"\u003e\n\u003c/p\u003e --\u003e\n\n\u003cdiv\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003ca href='https://vchitect.intern-ai.org.cn/' target='_blank'\u003eVchitect Team\u003csup\u003e1\u003c/sup\u003e\u003c/a\u003e\u0026emsp;\n\u003c/div\u003e\n\u003cdiv\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003csup\u003e1\u003c/sup\u003eShanghai Artificial Intelligence Laboratory\u0026emsp;\n\u003c/div\u003e\n \n \n\u003cdiv align=\"center\"\u003e\n                      \u003ca href=\"https://arxiv.org/abs/2501.08453\"\u003ePaper\u003c/a\u003e | \n                      \u003ca href=\"https://vchitect.intern-ai.org.cn/\"\u003eProject Page\u003c/a\u003e |\n                      \u003ca 
href=\"https://huggingface.co/datasets/Vchitect/Vchitect_T2V_DataVerse\"\u003eDataset\u003c/a\u003e\n\u003c/div\u003e\n\n---\n\n![](https://img.shields.io/badge/Vchitect2.0-v0.1-darkcyan)\n![](https://img.shields.io/github/stars/Vchitect/Vchitect-2.0)\n[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FVchitect%2FVchitect-2.0\u0026count_bg=%23BDC4B7\u0026title_bg=%2342C4A8\u0026icon=octopusdeploy.svg\u0026icon_color=%23E7E7E7\u0026title=visitors\u0026edge_flat=true)](https://hits.seeyoufarm.com)\n[![Generic badge](https://img.shields.io/badge/DEMO-Vchitect2.0_Demo-\u003cCOLOR\u003e.svg)](https://huggingface.co/spaces/Vchitect/Vchitect-2.0)\n[![Generic badge](https://img.shields.io/badge/Checkpoint-red.svg)](https://huggingface.co/Vchitect/Vchitect-XL-2B)\n\n\n\n\n\n## 🔥 Update and News\n- [2025.03.17] 🔥 Our [Vchitect-T2V-Dataverse](https://huggingface.co/datasets/Vchitect/Vchitect_T2V_DataVerse) is released.\n- [2025.01.25] Our [paper](https://arxiv.org/abs/2501.08453) is released.\n- [2024.09.14] Inference code and [checkpoint](https://huggingface.co/Vchitect/Vchitect-XL-2B) are released.\n\n## :astonished: Gallery\n\n\u003ctable class=\"center\"\u003e\n\n\u003ctr\u003e\n\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_0_seed3.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_1_seed3.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_3_seed2.gif\"\u003e \u003c/td\u003e \n\u003c/tr\u003e\n\n\n        \n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_4_seed1.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_4_seed4.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_5_seed4.gif\"\u003e \u003c/td\u003e     \n\u003c/tr\u003e\n\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_6_seed4.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg 
src=\"assets/samples/sample_8_seed0.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_8_seed2.gif\"\u003e \u003c/td\u003e      \n\u003c/tr\u003e\n\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_12_seed1.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_13_seed3.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_14.gif\"\u003e \u003c/td\u003e    \n\u003c/tr\u003e\n\n\u003c/table\u003e\n\n\n## Installation\n\n### 1. Create a conda environment and install PyTorch\n\nNote: You may want to adjust the CUDA version [according to your driver version](https://docs.nvidia.com/deploy/cuda-compatibility/#default-to-minor-version).\n\n  ```bash\n  conda create -n VchitectXL -y\n  conda activate VchitectXL\n  conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y\n  ```\n\n### 2. Install dependencies\n\n  ```bash\n  pip install -r requirements.txt\n  ```\n\n## Inference\n**First download the [checkpoint](https://huggingface.co/Vchitect/Vchitect-XL-2B).**\n~~~bash\n\nsave_dir=$1\nckpt_path=$2\n\npython inference.py --test_file assets/test.txt --save_dir \"${save_dir}\" --ckpt_path \"${ckpt_path}\"\n~~~\n\nIn inference.py, arguments for inference:\n  - **num_inference_steps**: Denoising steps, default is 100\n  - **guidance_scale**: CFG scale to use, default is 7.5\n  - **width**: The width of the output video, default is 768\n  - **height**: The height of the output video, default is 432\n  - **frames**: The number of frames, default is 40\n\nThe results below were generated using the example prompt.\n\n\u003ctable class=\"center\"\u003e\n\n\u003ctr\u003e\n\n  \u003c!-- \u003ctd\u003e\u003cimg src=\"assets/samples/sample_0_seed2.gif\"\u003e \u003c/td\u003e --\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/samples/sample_31_seed0.gif\"\u003e \u003c/td\u003e\n  \u003ctd\u003e\u003cimg 
src=\"assets/samples/sample_33_seed2.gif\"\u003e \u003c/td\u003e \n\u003c/tr\u003e\n\n\u003ctr\u003e\n  \u003c!-- \u003ctd\u003eThere is a painting depicting a turtle swimming in ocean.\u003c/td\u003e --\u003e\n  \u003ctd\u003eA snowy forest landscape with a dirt road running through it. The road is flanked by trees covered in snow, and the ground is also covered in snow. The sun is shining, creating a bright and serene atmosphere. The road appears to be empty, and there are no people or animals visible in the video. \u003c/td\u003e\n  \u003ctd\u003eThe video opens with a breathtaking view of a starry sky and vibrant auroras. The camera pans to reveal a glowing black hole surrounded by swirling, luminescent gas and dust. Below, an enchanted forest of bioluminescent trees glows softly. The scene is a mesmerizing blend of cosmic wonder and magical landscape.\u003c/td\u003e      \n\u003c/tr\u003e\n\u003c/table\u003e\n\n\n\n\nThe base T2V model supports generating videos with resolutions up to 720x480 at 8 fps. [VEnhancer](https://github.com/Vchitect/VEnhancer) is then used to upscale the resolution to 2K and interpolate the frame rate to 24 fps.\n\n## BibTeX\n```\n@article{fan2025vchitect,\n  title={Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models},\n  author={Fan, Weichen and Si, Chenyang and Song, Junhao and Yang, Zhenyu and He, Yinan and Zhuo, Long and Huang, Ziqi and Dong, Ziyue and He, Jingwen and Pan, Dongwei and others},\n  journal={arXiv preprint arXiv:2501.08453},\n  year={2025}\n}\n```\n\n## 🔑 License\n\nThis code is licensed under Apache-2.0. The framework is fully open for academic research and also allows free commercial usage.\n\n\n## Disclaimer\n\nWe disclaim responsibility for user-generated content. The model was not trained to realistically represent people or events, so using it to generate such content is beyond the model's capabilities. 
Generating pornographic, violent, or gory content, or content that is demeaning or harmful to people or their environment, culture, religion, etc., is prohibited. Users are solely liable for their actions. The project contributors are not legally affiliated with, nor accountable for, users' behaviors. Use the generative model responsibly, adhering to ethical and legal standards.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvchitect%2Fvchitect-2.0","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvchitect%2Fvchitect-2.0","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvchitect%2Fvchitect-2.0/lists"}