{"id":31055182,"url":"https://github.com/rehglab/rave","last_synced_at":"2026-01-27T04:01:03.952Z","repository":{"id":211367012,"uuid":"727519007","full_name":"RehgLab/RAVE","owner":"RehgLab","description":"RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models [CVPR 2024]","archived":false,"fork":false,"pushed_at":"2025-02-11T18:43:26.000Z","size":164125,"stargazers_count":314,"open_issues_count":3,"forks_count":20,"subscribers_count":8,"default_branch":"main","last_synced_at":"2026-01-15T03:42:00.267Z","etag":null,"topics":["diffusion","stable-diffusion","video-editing"],"latest_commit_sha":null,"homepage":"https://rave-video.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RehgLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-05T02:51:53.000Z","updated_at":"2025-12-11T02:32:31.000Z","dependencies_parsed_at":"2025-03-28T02:46:31.858Z","dependency_job_id":null,"html_url":"https://github.com/RehgLab/RAVE","commit_stats":null,"previous_names":["rehg-lab/rave","rehglab/rave"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/RehgLab/RAVE","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RehgLab%2FRAVE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RehgLab%2FRAVE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RehgLab%2FRAVE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RehgLab%2FRAVE/manifests","owner_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RehgLab","download_url":"https://codeload.github.com/RehgLab/RAVE/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RehgLab%2FRAVE/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28800890,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-27T03:44:14.111Z","status":"ssl_error","status_checked_at":"2026-01-27T03:43:33.507Z","response_time":168,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion","stable-diffusion","video-editing"],"created_at":"2025-09-15T04:04:35.489Z","updated_at":"2026-01-27T04:01:03.875Z","avatar_url":"https://github.com/RehgLab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models - Official Repo\n### CVPR 2024 (Highlight)\n\n[Ozgur Kara](https://karaozgur.com/), [Bariscan Kurtkaya](https://bariscankurtkaya.github.io/), [Hidir Yesiltepe](https://sites.google.com/view/hidir-yesiltepe), [James M. 
Rehg](https://scholar.google.com/citations?hl=en\u0026user=8kA3eDwAAAAJ), [Pinar Yanardag](https://scholar.google.com/citations?user=qzczdd8AAAAJ\u0026hl=en)\n\n\u003ca href=\"https://huggingface.co/spaces/ozgurkara/RAVE\"\u003e\u003cimg src=\"https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm-dark.svg\" alt=\"Web Demo\"\u003e\n\u003ca href='https://arxiv.org/abs/2312.04524'\u003e\u003cimg src='https://img.shields.io/badge/ArXiv-2312.04524-red'\u003e\u003c/a\u003e \n\u003ca href='https://rave-video.github.io/'\u003e\u003cimg src='https://img.shields.io/badge/Project-Page-green'\u003e\u003c/a\u003e\n\u003ca href='https://youtu.be/2hQho5AC9T0?si=3R_jYDbcL2olODCV'\u003e\u003cimg src='https://img.shields.io/badge/YouTube-red?style=for-the-badge\u0026logo=youtube\u0026logoColor=white'\u003e\u003c/a\u003e\n\u003ca href='https://rave-video.github.io/supp/supp.html'\u003e\u003cimg src='https://img.shields.io/badge/Supplementary-Page-yellow'\u003e\u003c/a\u003e\n[![GitHub](https://img.shields.io/github/stars/rehg-lab/RAVE?style=social)](https://github.com/rehg-lab/RAVE)\n\n![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2Frehg-lab%2FRAVE\u0026label=visitors\u0026countColor=%23263759)\n\n\n![teaser](assets/examples/grid-2x3.gif)\n(Note that the videos on GitHub are heavily compressed. The full videos are available on the project webpage.)\n\n## Abstract\n\u003cb\u003eTL; DR:\u003c/b\u003e RAVE is a zero-shot, lightweight, and fast framework for text-guided video editing, supporting videos of any length utilizing text-to-image pretrained diffusion models. \n\n\u003cdetails\u003e\u003csummary\u003eClick for the full abstract\u003c/summary\u003e\n\n\n\u003e Recent advancements in diffusion-based models have demonstrated significant success in generating images from text. However, video editing models have not yet reached the same level of visual quality and user control. 
To address this, we introduce RAVE, a zero-shot video editing method that leverages pre-trained text-to-image diffusion models without additional training. RAVE takes an input video and a text prompt to produce high-quality videos while preserving the original motion and semantic structure. It employs a novel noise shuffling strategy, leveraging spatio-temporal interactions between frames, to produce temporally consistent videos faster than existing methods. It is also efficient in terms of memory requirements, allowing it to handle longer videos.  RAVE is capable of a wide range of edits, from local attribute modifications to shape transformations. In order to demonstrate the versatility of RAVE, we create a comprehensive video evaluation dataset ranging from object-focused scenes to complex human activities like dancing and typing, and dynamic scenes featuring swimming fish and boats. Our qualitative and quantitative experiments highlight the effectiveness of RAVE in diverse video editing scenarios compared to existing methods.\n\u003c/details\u003e\n\n\u003cbr\u003e\n\n**Features**:\n- *Zero-shot framework*\n- *Working fast*\n- *No restriction on video length*\n- *Standardized dataset for evaluating text-guided video-editing methods*\n- *Compatible with off-the-shelf pre-trained approaches (e.g. 
[CivitAI](https://civitai.com/))*\n\n\n## Updates\n- [12/2023] Gradio demo is released, HuggingFace Space demo will be released soon\n- [12/2023] Paper is available on ArXiv, project webpage is ready and code is released.\n\n### TODO\n- [ ] Share the dataset\n- [X] Add more examples\n- [X] Optimize preprocessing\n- [X] Add CivitAI models to Grad.io\n- [X] ~~Prepare a grad.io based GUI~~\n- [X] ~~Integrate MultiControlNet~~\n- [X] ~~Adapt CIVIT AI models~~\n\n## Installation and Inference\n\n### Setup Environment\nPlease set up the environment using the `requirements.txt` file as follows:\n```shell\nconda create -n rave python=3.8\nconda activate rave\nconda install pip\npip cache purge\npip install -r requirements.txt\n```\nAlso, please install PyTorch and Xformers as\n```shell\npip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118\npip install xformers==0.0.20\n```\nto set up the Conda environment.\n\nOur code was tested on Linux with the following versions:\n```shell\ntimm==0.6.7 torch==2.0.1+cu118 xformers==0.0.20 diffusers==0.18.2 torch.version.cuda==11.8 python==3.8.0\n```\n\n### WebUI Demo\n\nTo run our Gradio-based web demo, run the following command:\n```shell\npython webui.py\n```\nThen, specify your configurations and perform editing.\n\n\n### Inference\n\n\nTo run RAVE, please follow these steps:\n\n1- Put the video you want to edit under `data/mp4_videos` as an MP4 file. Note that we suggest using videos with a size of 512x512 or 512x320.\n\n2- Prepare a config file under the `configs` directory. Set the `video_name` parameter to the name of the MP4 file. You can find detailed descriptions of the parameters and example configurations there.\n\n3- Run the following command:\n```shell\npython scripts/run_experiment.py [PATH OF CONFIG FILE]\n```\n4- The results will be generated under the `results` directory. 
Also, the latents and controls are saved under the `generated` directory to speed up editing with different prompts on the same video.\nNote that the names of the available preprocessors can be found in `utils/constants.py`.\n\n### Use Customized Models from CIVIT AI\n\nOur code can run any customized model from CIVIT AI. To use these models, please follow these steps:\n\n1- Determine which model you want to use from CIVIT AI, and obtain its version ID. For example, the ID for RealisticVision V5.1 is 130072; it appears in the model's URL as the value of the `modelVersionId` parameter, e.g. https://civitai.com/models/4201?modelVersionId=130072\n\n2- In the current directory, run the following command. It downloads the model in safetensors format, and converts it to '.bin' format that is compatible with diffusers.\n```shell\nbash CIVIT_AI/civit_ai.sh 130072\n```\n3- Copy the path of the converted model, `$CWD/CIVIT_AI/diffusers_models/[CUSTOMIZED MODEL]` (e.g. `CIVIT_AI/diffusers_models/realisticVisionV60B1_v51VAE` for 130072), and use the path in the config file.\n\n\n## Dataset\n\nDataset will be released soon.\n\n## Examples \n### Type of Edits\n\u003ctable\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/glitter.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/watercolor-new.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/coast.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd width=33% style=\"text-align:center;\"\u003e1- Local Editing\u003c/td\u003e\n  \u003ctd width=33% style=\"text-align:center;\"\u003e2- Visual Style Editing\u003c/td\u003e\n  \u003ctd width=33% style=\"text-align:center;\"\u003e3- Background Editing\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/a_dinosaur.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg 
src=\"assets/examples/tractor.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003ctr\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003e4- Shape/Attribute Editing\u003c/td\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003e5- Extreme Shape Editing\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### Editing on Various Types of Motions\n\u003ctable\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/crochet.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/anime.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/rave.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd width=33% style=\"text-align:center;\"\u003e1- Exo-motion\u003c/td\u003e\n  \u003ctd width=33% style=\"text-align:center;\"\u003e2- Ego-motion\u003c/td\u003e\n  \u003ctd width=33% style=\"text-align:center;\"\u003e3- Ego-exo motion\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/cheetah.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"assets/examples/whales.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003ctr\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003e4- Occlusions\u003c/td\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003e5- Multiple objects with appearance/disappearance\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n## Citation \n\n```\n@inproceedings{kara2024rave,\n  title={RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models},\n  author={Ozgur Kara and Bariscan Kurtkaya and Hidir Yesiltepe and James M. Rehg and Pinar Yanardag},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  year={2024}\n}\n\n``` \n\n## Maintenance\n\nThis is the official repository for **RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models**. 
Feel free to contact [Ozgur Kara](ozgurrkara99@gmail.com) with any questions or discussions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frehglab%2Frave","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frehglab%2Frave","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frehglab%2Frave/lists"}