{"id":13465863,"url":"https://github.com/hpcaitech/Open-Sora","last_synced_at":"2025-03-25T21:30:46.999Z","repository":{"id":225815784,"uuid":"760231710","full_name":"hpcaitech/Open-Sora","owner":"hpcaitech","description":"Open-Sora: Democratizing Efficient Video Production for All","archived":false,"fork":false,"pushed_at":"2024-08-09T03:40:06.000Z","size":132984,"stargazers_count":22080,"open_issues_count":24,"forks_count":2150,"subscribers_count":186,"default_branch":"main","last_synced_at":"2024-10-29T19:09:39.363Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://hpcaitech.github.io/Open-Sora/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hpcaitech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-20T03:01:34.000Z","updated_at":"2024-10-29T16:49:42.000Z","dependencies_parsed_at":"2024-03-07T09:29:48.950Z","dependency_job_id":"56a53314-f7a9-472b-8b52-665a2fc15260","html_url":"https://github.com/hpcaitech/Open-Sora","commit_stats":null,"previous_names":["hpcaitech/open-sora"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hpcaitech%2FOpen-Sora","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hpcaitech%2FOpen-Sora/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hpcaitech%2FOpen-Sora/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hpcaitech%2FOpen-Sora/manifests","owner_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hpcaitech","download_url":"https://codeload.github.com/hpcaitech/Open-Sora/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245546923,"owners_count":20633257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T15:00:36.384Z","updated_at":"2025-03-25T21:30:46.983Z","avatar_url":"https://github.com/hpcaitech.png","language":"Python","funding_links":[],"categories":["Python","**Section 7** : Large Language Model: Landscape","Table of Contents \u003c!-- omit in toc --\u003e","\u003cspan id=\"video\"\u003eVideo\u003c/span\u003e","Trending LLM Projects","App","视频生成、补帧、摘要","HarmonyOS","📄 Paper List","视频 Video","Poster","2. 
Open Foundation Models","Open Source Projects","🔧 Utilities \u0026 Miscellaneous"],"sub_categories":["**Open-Source Large Language Models**","Open-source Toolboxes and Foundation Models","\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e","网络服务_其他","Windows Manager","Sparse Sequence Modeling","World Models \u0026 Simulation"],"readme":"\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/readme/icon.png\" width=\"250\"/\u003e\n\u003c/p\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003ca href=\"https://github.com/hpcaitech/Open-Sora/stargazers\"\u003e\u003cimg src=\"https://img.shields.io/github/stars/hpcaitech/Open-Sora?style=social\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://arxiv.org/abs/2503.09642v1\"\u003e\u003cimg src=\"https://img.shields.io/static/v1?label=Tech Report 2.0\u0026message=Arxiv\u0026color=red\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://arxiv.org/abs/2412.20404\"\u003e\u003cimg src=\"https://img.shields.io/static/v1?label=Tech Report 1.2\u0026message=Arxiv\u0026color=red\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://hpcaitech.github.io/Open-Sora/\"\u003e\u003cimg src=\"https://img.shields.io/badge/Gallery-View-orange?logo=\u0026amp\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n    \u003ca href=\"https://discord.gg/kZakZzrSUT\"\u003e\u003cimg src=\"https://img.shields.io/badge/Discord-join-blueviolet?logo=discord\u0026amp\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-247ipg9fk-KRRYmUl~u2ll2637WRURVA\"\u003e\u003cimg src=\"https://img.shields.io/badge/Slack-ColossalAI-blueviolet?logo=slack\u0026amp\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://x.com/YangYou1991/status/1899973689460044010\"\u003e\u003cimg src=\"https://img.shields.io/badge/Twitter-Discuss-blue?logo=twitter\u0026amp\"\u003e\u003c/a\u003e\n    \u003ca 
href=\"https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png\"\u003e\u003cimg src=\"https://img.shields.io/badge/微信-小助手加群-green?logo=wechat\u0026amp\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n## Open-Sora: Democratizing Efficient Video Production for All\n\nWe design and implement **Open-Sora**, an initiative dedicated to **efficiently** producing high-quality video. We hope to make the model,\ntools and all details accessible to all. By embracing **open-source** principles,\nOpen-Sora not only democratizes access to advanced video generation techniques, but also offers a\nstreamlined and user-friendly platform that simplifies the complexities of video generation.\nWith Open-Sora, our goal is to foster innovation, creativity, and inclusivity within the field of content creation.\n\n🎬 For a professional AI video-generation product, try [Video Ocean](https://video-ocean.com/) — powered by a superior model.\n\u003cdiv align=\"center\"\u003e\n   \u003ca href=\"https://video-ocean.com/\"\u003e\n   \u003cimg src=\"https://github.com/hpcaitech/public_assets/blob/main/colossalai/img/3.gif\" width=\"850\" /\u003e\n   \u003c/a\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n   \u003ca href=\"https://hpc-ai.com/?utm_source=github\u0026utm_medium=social\u0026utm_campaign=promotion-opensora\"\u003e\n   \u003cimg src=\"https://github.com/hpcaitech/public_assets/blob/main/colossalai/img/1.gif\" width=\"850\" /\u003e\n   \u003c/a\u003e\n\u003c/div\u003e\n\n\u003c!-- [[中文文档](/docs/zh_CN/README.md)] [[潞晨云](https://cloud.luchentech.com/)|[OpenSora镜像](https://cloud.luchentech.com/doc/docs/image/open-sora/)|[视频教程](https://www.bilibili.com/video/BV1ow4m1e7PX/?vd_source=c6b752764cd36ff0e535a768e35d98d2)] --\u003e\n\n## 📰 News\n\n- **[2025.03.12]** 🔥 We released **Open-Sora 2.0** (11B). 🎬 11B model achieves [on-par performance](#evaluation) with 11B HunyuanVideo \u0026 30B Step-Video on 📐VBench \u0026 📊Human Preference. 
🛠️ Fully open-source: checkpoints and training code, with training costing only **$200K**. [[report]](https://arxiv.org/abs/2503.09642v1)\n- **[2025.02.20]** 🔥 We released **Open-Sora 1.3** (1B). With the upgraded VAE and Transformer architecture, the quality of our generated videos has been greatly improved 🚀. [[checkpoints]](#open-sora-13-model-weights) [[report]](/docs/report_04.md) [[demo]](https://huggingface.co/spaces/hpcai-tech/open-sora)\n- **[2024.12.23]** The development cost of video generation models has been cut by 50%! Open-source solutions are now available with H200 GPU vouchers. [[blog]](https://company.hpc-ai.com/blog/the-development-cost-of-video-generation-models-has-saved-by-50-open-source-solutions-are-now-available-with-h200-gpu-vouchers) [[code]](https://github.com/hpcaitech/Open-Sora/blob/main/scripts/train.py) [[vouchers]](https://colossalai.org/zh-Hans/docs/get_started/bonus/)\n- **[2024.06.17]** We released **Open-Sora 1.2**, which includes **3D-VAE**, **rectified flow**, and **score condition**. The video quality is greatly improved. [[checkpoints]](#open-sora-12-model-weights) [[report]](/docs/report_03.md) [[arxiv]](https://arxiv.org/abs/2412.20404)\n- **[2024.04.25]** 🤗 We released the [Gradio demo for Open-Sora](https://huggingface.co/spaces/hpcai-tech/open-sora) on Hugging Face Spaces.\n- **[2024.04.25]** We released **Open-Sora 1.1**, which supports **2s~15s, 144p to 720p, any aspect ratio** text-to-image, **text-to-video, image-to-video, video-to-video, infinite time** generation. A full video processing pipeline was also released. 
[[checkpoints]](#open-sora-11-model-weights) [[report]](/docs/report_02.md)\n- **[2024.03.18]** We released **Open-Sora 1.0**, a fully open-source project for video generation.\n  Open-Sora 1.0 supports a full pipeline of video data preprocessing, training with\n  \u003ca href=\"https://github.com/hpcaitech/ColossalAI\"\u003e\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/readme/colossal_ai.png\" width=\"8%\" \u003e\u003c/a\u003e\n  acceleration,\n  inference, and more. Our model can produce 2s 512x512 videos with only 3 days of training. [[checkpoints]](#open-sora-10-model-weights)\n  [[blog]](https://hpc-ai.com/blog/open-sora-v1.0) [[report]](/docs/report_01.md)\n- **[2024.03.04]** Open-Sora provides training with a 46% cost reduction.\n  [[blog]](https://hpc-ai.com/blog/open-sora)\n\n📍 Since Open-Sora is under active development, we maintain separate branches for different versions. The latest version is [main](https://github.com/hpcaitech/Open-Sora). Old versions include: [v1.0](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.0), [v1.1](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.1), [v1.2](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.2), [v1.3](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.3).\n\n## 🎥 Latest Demo\n\nDemos are presented in compressed GIF format for convenience. 
For original quality samples and their corresponding prompts, please visit our [Gallery](https://hpcaitech.github.io/Open-Sora/).\n\n| **5s 1024×576**                                                                                                                                    | **5s 576×1024**                                                                                                                                    | **5s 576×1024**                                                                                                                                   |\n| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/ft_0001_1_1.gif\" width=\"\"\u003e](https://streamable.com/e/8g9y9h?autoplay=1) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/movie_0160.gif\" width=\"\"\u003e](https://streamable.com/e/k50mnv?autoplay=1)  | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/movie_0017.gif\" width=\"\"\u003e](https://streamable.com/e/bzrn9n?autoplay=1) |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/ft_0012_1_1.gif\" width=\"\"\u003e](https://streamable.com/e/dsv8da?autoplay=1) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/douyin_0005.gif\" width=\"\"\u003e](https://streamable.com/e/3wif07?autoplay=1) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/movie_0037.gif\" 
width=\"\"\u003e](https://streamable.com/e/us2w7h?autoplay=1) |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/ft_0055_1_1.gif\" width=\"\"\u003e](https://streamable.com/e/yfwk8i?autoplay=1) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/sora_0019.gif\" width=\"\"\u003e](https://streamable.com/e/jgjil0?autoplay=1)   | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/movie_0463.gif\" width=\"\"\u003e](https://streamable.com/e/lsoai1?autoplay=1) |\n\n\u003cdetails\u003e\n\u003csummary\u003eOpenSora 1.3 Demo\u003c/summary\u003e\n\n| **5s 720×1280**                                                                                                                                                        | **5s 720×1280**                                                                                                                                                           | **5s 720×1280**                                                                                                                                                              |\n| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_tomato.gif\" width=\"\"\u003e](https://streamable.com/e/r0imrp?quality=highest\u0026amp;autoplay=1) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_fisherman.gif\" 
width=\"\"\u003e](https://streamable.com/e/hfvjkh?quality=highest\u0026amp;autoplay=1) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_girl2.gif\" width=\"\"\u003e](https://streamable.com/e/kutmma?quality=highest\u0026amp;autoplay=1)        |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_grape.gif\" width=\"\"\u003e](https://streamable.com/e/osn1la?quality=highest\u0026amp;autoplay=1)  | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_mushroom.gif\" width=\"\"\u003e](https://streamable.com/e/l1pzws?quality=highest\u0026amp;autoplay=1)  | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_parrot.gif\" width=\"\"\u003e](https://streamable.com/e/2vqari?quality=highest\u0026amp;autoplay=1)       |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_trans.gif\" width=\"\"\u003e](https://streamable.com/e/1in7d6?quality=highest\u0026amp;autoplay=1)  | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_bear.gif\" width=\"\"\u003e](https://streamable.com/e/e9bi4o?quality=highest\u0026amp;autoplay=1)      | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_futureflower.gif\" width=\"\"\u003e](https://streamable.com/e/09z7xi?quality=highest\u0026amp;autoplay=1) |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_fire.gif\" width=\"\"\u003e](https://streamable.com/e/16c3hk?quality=highest\u0026amp;autoplay=1)   | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_man.gif\" width=\"\"\u003e](https://streamable.com/e/wi250w?quality=highest\u0026amp;autoplay=1)       | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.3/demo_black.gif\" 
width=\"\"\u003e](https://streamable.com/e/vw5b64?quality=highest\u0026amp;autoplay=1)        |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eOpenSora 1.2 Demo\u003c/summary\u003e\n\n| **4s 720×1280**                                                                                                                                                                                     | **4s 720×1280**                                                                                                                                                                                     | **4s 720×1280**                                                                                                                                                                                     |\n| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0013.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/7895aab6-ed23-488c-8486-091480c26327) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_1718.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/20f07c7b-182b-4562-bbee-f1df74c86c9a) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0087.gif\" 
width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/3d897e0d-dc21-453a-b911-b3bda838acc2) |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0052.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/644bf938-96ce-44aa-b797-b3c0b513d64c) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_1719.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/272d88ac-4b4a-484d-a665-8d07431671d0) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0002.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/ebbac621-c34e-4bb4-9543-1c34f8989764) |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0011.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/a1e3a1a3-4abd-45f5-8df2-6cced69da4ca) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0004.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/d6ce9c13-28e1-4dff-9644-cc01f5f11926) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.2/sample_0061.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/561978f8-f1b0-4f4d-ae7b-45bec9001b4a) |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eOpenSora 1.1 Demo\u003c/summary\u003e\n\n| **2s 240×426**                                                                                                                                                                                                  | **2s 240×426**                                                                                                                                                                                                 |\n| 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sample_16x240x426_9.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c31ebc52-de39-4a4e-9b1e-9211d45e05b2) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sora_16x240x426_26.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c31ebc52-de39-4a4e-9b1e-9211d45e05b2) |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sora_16x240x426_27.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/f7ce4aaa-528f-40a8-be7a-72e61eaacbbd)  | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sora_16x240x426_40.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/5d58d71e-1fda-4d90-9ad3-5f2f7b75c6a9) |\n\n| **2s 426×240**                                                                                                                                                                                                 | **4s 480×854**                                                                                                                                                                                                  |\n| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sora_16x426x240_24.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/34ecb4a0-4eef-4286-ad4c-8e3a87e5a9fd) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sample_32x480x854_9.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c1619333-25d7-42ba-a91c-18dbc1870b18) |\n\n| **16s 320×320**                                                                                                                                                                                            | **16s 224×448**                                                                                                                                                                                            | **2s 426×240**                                                                                                                                                                                                |\n| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [\u003cimg 
src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sample_16s_320x320.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/3cab536e-9b43-4b33-8da8-a0f9cf842ff2) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sample_16s_224x448.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/9fb0b9e0-c6f4-4935-b29e-4cac10b373c4) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.1/sora_16x426x240_3.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/3e892ad2-9543-4049-b005-643a4c1bf3bf) |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eOpenSora 1.0 Demo\u003c/summary\u003e\n\n| **2s 512×512**                                                                                                                                                                                   | **2s 512×512**                                                                                                                                                                                   | **2s 512×512**                                                                                                                                                                                   |\n| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |\n| [\u003cimg 
src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.0/sample_0.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/de1963d3-b43b-4e68-a670-bb821ebb6f80) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.0/sample_1.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/13f8338f-3d42-4b71-8142-d234fbd746cc) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.0/sample_2.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/fa6a65a6-e32a-4d64-9a9e-eabb0ebb8c16) |\n| A serene night scene in a forested area. [...] The video is a time-lapse, capturing the transition from day to night, with the lake and forest serving as a constant backdrop.                   | A soaring drone footage captures the majestic beauty of a coastal cliff, [...] The water gently laps at the rock base and the greenery that clings to the top of the cliff.                      | The majestic beauty of a waterfall cascading down a cliff into a serene lake. [...] The camera angle provides a bird's eye view of the waterfall.                                                |\n| [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.0/sample_3.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/64232f84-1b36-4750-a6c0-3e610fa9aa94) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.0/sample_4.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/983a1965-a374-41a7-a76b-c07941a6c1e9) | [\u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v1.0/sample_5.gif\" width=\"\"\u003e](https://github.com/hpcaitech/Open-Sora/assets/99191637/ec10c879-9767-4c31-865f-2e8d6cf11e65) |\n| A bustling city street at night, filled with the glow of car headlights and the ambient light of streetlights. [...]                                               
                              | The vibrant beauty of a sunflower field. The sunflowers are arranged in neat rows, creating a sense of order and symmetry. [...]                                                                 | A serene underwater scene featuring a sea turtle swimming through a coral reef. The turtle, with its greenish-brown shell [...]                                                                  |\n\nVideos are downsampled to `.gif` for display. Click a GIF to view the original video. Prompts are trimmed for display;\nsee [here](/assets/texts/t2v_samples.txt) for full prompts.\n\n\u003c/details\u003e\n\n## 🔆 Reports\n\n- **[Tech Report of Open-Sora 2.0](https://arxiv.org/abs/2503.09642v1)**\n- **[Step by step to train or finetune your own model](docs/train.md)**\n- **[Step by step to train and evaluate a video autoencoder](docs/ae.md)**\n- **[Visit the high compression video autoencoder](docs/hcae.md)**\n- Reports for previous versions (best viewed in the corresponding branch):\n  - [Open-Sora 1.3](docs/report_04.md): shift-window attention, unified spatial-temporal VAE, etc.\n  - [Open-Sora 1.2](docs/report_03.md), [Tech Report](https://arxiv.org/abs/2412.20404): rectified flow, 3D-VAE, score condition, evaluation, etc.\n  - [Open-Sora 1.1](docs/report_02.md): multi-resolution/length/aspect-ratio, image/video conditioning/editing, data preprocessing, etc.\n  - [Open-Sora 1.0](docs/report_01.md): architecture, captioning, etc.\n\n📍 Since Open-Sora is under active development, we maintain separate branches for different versions. The latest version is [main](https://github.com/hpcaitech/Open-Sora). 
Old versions include: [v1.0](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.0), [v1.1](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.1), [v1.2](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.2), [v1.3](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.3).\n\n## Quickstart\n\n### Installation\n\n```bash\n# create a virtual env and activate it (conda as an example)\nconda create -n opensora python=3.10\nconda activate opensora\n\n# download the repo\ngit clone https://github.com/hpcaitech/Open-Sora\ncd Open-Sora\n\n# Ensure torch \u003e= 2.4.0\npip install -v . # for development mode, `pip install -v -e .`\npip install xformers==0.0.27.post2 --index-url https://download.pytorch.org/whl/cu121 # install xformers according to your CUDA version\npip install flash-attn --no-build-isolation\n```\n\nOptionally, you can install FlashAttention 3 for higher speed.\n\n```bash\ngit clone https://github.com/Dao-AILab/flash-attention # 4f0640d5\ncd flash-attention/hopper\npython setup.py install\n```\n\n### Model Download\n\nOur 11B model supports 256px and 768px resolution. Both T2V and I2V are supported by one model. 🤗 [Hugging Face](https://huggingface.co/hpcai-tech/Open-Sora-v2) 🤖 [ModelScope](https://modelscope.cn/models/luchentech/Open-Sora-v2).\n\nDownload from Hugging Face:\n\n```bash\npip install \"huggingface_hub[cli]\"\nhuggingface-cli download hpcai-tech/Open-Sora-v2 --local-dir ./ckpts\n```\n\nDownload from ModelScope:\n\n```bash\npip install modelscope\nmodelscope download hpcai-tech/Open-Sora-v2 --local_dir ./ckpts\n```\n\n### Text-to-Video Generation\n\nOur model is optimized for image-to-video generation, but it can also be used for text-to-video generation. To generate high-quality videos, we build a text-to-image-to-video pipeline with the help of the FLUX text-to-image model. 
For 256x256 resolution:\n\n```bash\n# Generate a video from a single prompt\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --prompt \"raining, sea\"\n\n# Save memory with offloading\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --prompt \"raining, sea\" --offload True\n\n# Generate from a CSV file of prompts\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --dataset.data-path assets/texts/example.csv\n```\n\nFor 768x768 resolution:\n\n```bash\n# One GPU\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_768px.py --save-dir samples --prompt \"raining, sea\"\n\n# Multi-GPU with ColossalAI sequence parallelism\ntorchrun --nproc_per_node 8 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_768px.py --save-dir samples --prompt \"raining, sea\"\n```\n\nYou can set the aspect ratio with `--aspect_ratio` and the video length with `--num_frames`. Candidate values for `aspect_ratio` include `16:9`, `9:16`, `1:1`, `2.39:1`. 
Candidate values for `num_frames` must be of the form `4k+1` and less than 129.\n\nYou can also run direct text-to-video generation with:\n\n```bash\n# One GPU for 256px\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/256px.py --prompt \"raining, sea\"\n# Multi-GPU for 768px\ntorchrun --nproc_per_node 8 --standalone scripts/diffusion/inference.py configs/diffusion/inference/768px.py --prompt \"raining, sea\"\n```\n\n### Image-to-Video Generation\n\nGiven a prompt and a reference image, you can generate a video with the following command:\n\n```bash\n# 256px\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/256px.py --cond_type i2v_head --prompt \"A plump pig wallows in a muddy pond on a rustic farm, its pink snout poking out as it snorts contentedly. The camera captures the pig's playful splashes, sending ripples through the water under the midday sun. Wooden fences and a red barn stand in the background, framed by rolling green hills. The pig's muddy coat glistens in the sunlight, showcasing the simple pleasures of its carefree life.\" --ref assets/texts/i2v.png\n\n# 256px with csv\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/256px.py --cond_type i2v_head --dataset.data-path assets/texts/i2v.csv\n\n# Multi-GPU 768px\ntorchrun --nproc_per_node 8 --standalone scripts/diffusion/inference.py configs/diffusion/inference/768px.py --cond_type i2v_head --dataset.data-path assets/texts/i2v.csv\n```\n\n## Advanced Usage\n\n### Motion Score\n\nDuring training, we inject a motion score into the text prompt. 
During inference, you can use the following command to generate videos with motion score (the default score is 4):\n\n```bash\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --prompt \"raining, sea\" --motion-score 4\n```\n\nWe also provide a dynamic motion score evaluator. After setting your OpenAI API key, you can use the following command to evaluate the motion score of a video:\n\n```bash\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --prompt \"raining, sea\" --motion-score dynamic\n```\n\n| Score | 1                                                                                                       | 4                                                                                                       | 7                                                                                                       |\n| ----- | ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |\n|       | \u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/motion_score_1.gif\" width=\"\"\u003e | \u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/motion_score_4.gif\" width=\"\"\u003e | \u003cimg src=\"https://github.com/hpcaitech/Open-Sora-Demo/blob/main/demo/v2.0/motion_score_7.gif\" width=\"\"\u003e |\n\n### Prompt Refine\n\nWe take advantage of ChatGPT to refine the prompt. You can use the following command to refine the prompt. 
This feature is available for both text-to-video and image-to-video generation.\n\n```bash\nexport OPENAI_API_KEY=sk-xxxx\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --prompt \"raining, sea\" --refine-prompt True\n```\n\n### Reproducibility\n\nTo make the results reproducible, you can set the random seed by:\n\n```bash\ntorchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_256px.py --save-dir samples --prompt \"raining, sea\" --sampling_option.seed 42 --seed 42\n```\n\nUse `--num-sample k` to generate `k` samples for each prompt.\n\n## Computational Efficiency\n\nWe test the computational efficiency of text-to-video generation on H100/H800 GPUs. For 256x256, we use colossalai's tensor parallelism, and `--offload True` is used. For 768x768, we use colossalai's sequence parallelism. All runs use 50 steps. The results are presented in the format: $\\color{blue}{\\text{Total time (s)}}/\\color{red}{\\text{peak GPU memory (GB)}}$\n\n| Resolution | 1x GPU                                 | 2x GPUs                               | 4x GPUs                               | 8x GPUs                               |\n| ---------- | -------------------------------------- | ------------------------------------- | ------------------------------------- | ------------------------------------- |\n| 256x256    | $\\color{blue}{60}/\\color{red}{52.5}$   | $\\color{blue}{40}/\\color{red}{44.3}$  | $\\color{blue}{34}/\\color{red}{44.3}$  |                                       |\n| 768x768    | $\\color{blue}{1656}/\\color{red}{60.3}$ | $\\color{blue}{863}/\\color{red}{48.3}$ | $\\color{blue}{466}/\\color{red}{44.3}$ | $\\color{blue}{276}/\\color{red}{44.3}$ |\n\n## Evaluation\n\nOn [VBench](https://huggingface.co/spaces/Vchitect/VBench_Leaderboard), Open-Sora 2.0 significantly narrows the gap with OpenAI’s Sora, reducing it from 4.52% → 0.69% compared to 
Open-Sora 1.2.\n\n![VBench](https://github.com/hpcaitech/Open-Sora-Demo/blob/main/readme/v2_vbench.png)\n\nHuman preference results show our model is on par with HunyuanVideo 11B and Step-Video 30B.\n\n![Win Rate](https://github.com/hpcaitech/Open-Sora-Demo/blob/main/readme/v2_winrate.png)\n\nWith strong performance, Open-Sora 2.0 is cost-effective.\n\n![Cost](https://github.com/hpcaitech/Open-Sora-Demo/blob/main/readme/v2_cost.png)\n\n## Contribution\n\nThanks goes to these wonderful contributors:\n\n\u003ca href=\"https://github.com/hpcaitech/Open-Sora/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=hpcaitech/Open-Sora\" /\u003e\n\u003c/a\u003e\n\nIf you wish to contribute to this project, please refer to the [Contribution Guideline](./CONTRIBUTING.md).\n\n## Acknowledgement\n\nHere we only list a few of the projects. For other works and datasets, please refer to our report.\n\n- [ColossalAI](https://github.com/hpcaitech/ColossalAI): A powerful large model parallel acceleration and optimization\n  system.\n- [DiT](https://github.com/facebookresearch/DiT): Scalable Diffusion Models with Transformers.\n- [OpenDiT](https://github.com/NUS-HPC-AI-Lab/OpenDiT): An acceleration for DiT training. 
We adopt valuable acceleration\n  strategies for the training process from OpenDiT.\n- [PixArt](https://github.com/PixArt-alpha/PixArt-alpha): An open-source DiT-based text-to-image model.\n- [Flux](https://github.com/black-forest-labs/flux): A powerful text-to-image generation model.\n- [Latte](https://github.com/Vchitect/Latte): An attempt to efficiently train DiT for video.\n- [HunyuanVideo](https://github.com/Tencent/HunyuanVideo/tree/main?tab=readme-ov-file): Open-Source text-to-video model.\n- [StabilityAI VAE](https://huggingface.co/stabilityai/sd-vae-ft-mse-original): A powerful image VAE model.\n- [DC-AE](https://github.com/mit-han-lab/efficientvit): Deep Compression AutoEncoder for image compression.\n- [CLIP](https://github.com/openai/CLIP): A powerful text-image embedding model.\n- [T5](https://github.com/google-research/text-to-text-transfer-transformer): A powerful text encoder.\n- [LLaVA](https://github.com/haotian-liu/LLaVA): A powerful image captioning model based on [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) and [Yi-34B](https://huggingface.co/01-ai/Yi-34B).\n- [PLLaVA](https://github.com/magic-research/PLLaVA): A powerful video captioning model.\n- [MiraData](https://github.com/mira-space/MiraData): A large-scale video dataset with long durations and structured captions.\n\n## Citation\n\n```bibtex\n@article{opensora,\n  title={Open-sora: Democratizing efficient video production for all},\n  author={Zheng, Zangwei and Peng, Xiangyu and Yang, Tianji and Shen, Chenhui and Li, Shenggui and Liu, Hongxin and Zhou, Yukun and Li, Tianyi and You, Yang},\n  journal={arXiv preprint arXiv:2412.20404},\n  year={2024}\n}\n\n@article{opensora2,\n    title={Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k}, \n    author={Xiangyu Peng and Zangwei Zheng and Chenhui Shen and Tom Young and Xinying Guo and Binluo Wang and Hang Xu and Hongxin Liu and Mingyan Jiang and Wenjun Li and Yuhui Wang and Anbang Ye and Gang Ren and 
Qianran Ma and Wanying Liang and Xiang Lian and Xiwen Wu and Yuting Zhong and Zhuangyan Li and Chaoyu Gong and Guojun Lei and Leijun Cheng and Limin Zhang and Minghao Li and Ruijie Zhang and Silan Hu and Shijie Huang and Xiaokang Wang and Yuanheng Zhao and Yuqi Wang and Ziang Wei and Yang You},\n    year={2025},\n    journal={arXiv preprint arXiv:2503.09642},\n}\n```\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=hpcaitech/Open-Sora\u0026type=Date)](https://star-history.com/#hpcaitech/Open-Sora\u0026Date)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhpcaitech%2FOpen-Sora","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhpcaitech%2FOpen-Sora","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhpcaitech%2FOpen-Sora/lists"}