{"id":29219741,"url":"https://github.com/dvlab-research/video-p2p","last_synced_at":"2025-07-03T02:06:52.489Z","repository":{"id":144081876,"uuid":"616034965","full_name":"dvlab-research/Video-P2P","owner":"dvlab-research","description":"Video-P2P: Video Editing with Cross-attention Control","archived":false,"fork":false,"pushed_at":"2024-03-12T13:31:27.000Z","size":5602,"stargazers_count":333,"open_issues_count":5,"forks_count":22,"subscribers_count":9,"default_branch":"main","last_synced_at":"2024-05-16T00:00:50.012Z","etag":null,"topics":["generative-model","image-editing","stable-diffusion","text-driven-editing","video-editing"],"latest_commit_sha":null,"homepage":"https://video-p2p.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dvlab-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-19T12:34:06.000Z","updated_at":"2024-07-20T23:24:23.326Z","dependencies_parsed_at":"2024-07-20T23:42:02.179Z","dependency_job_id":null,"html_url":"https://github.com/dvlab-research/Video-P2P","commit_stats":null,"previous_names":["dvlab-research/video-p2p"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dvlab-research/Video-P2P","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVideo-P2P","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVideo-P2P/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVideo-P2P/releases","manifests_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVideo-P2P/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dvlab-research","download_url":"https://codeload.github.com/dvlab-research/Video-P2P/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVideo-P2P/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263245317,"owners_count":23436514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["generative-model","image-editing","stable-diffusion","text-driven-editing","video-editing"],"created_at":"2025-07-03T02:06:48.134Z","updated_at":"2025-07-03T02:06:52.481Z","avatar_url":"https://github.com/dvlab-research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Video-P2P: Video Editing with Cross-attention Control (CVPR 2024) \nThe official implementation of [Video-P2P](https://video-p2p.github.io/).\n\n[Shaoteng Liu](https://www.shaotengliu.com/), [Yuechen Zhang](https://julianjuaner.github.io/), [Wenbo Li](https://fenglinglwb.github.io/), [Zhe Lin](https://sites.google.com/site/zhelin625/), [Jiaya Jia](https://jiaya.me/)\n\n[![Project Website](https://img.shields.io/badge/Project-Website-orange)](https://video-p2p.github.io/)\n[![arXiv](https://img.shields.io/badge/arXiv-2303.04761-b31b1b.svg)](https://arxiv.org/abs/2303.04761)\n[![Hugging Face 
Demo](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/video-p2p-library/Video-P2P-Demo)\n\n![Teaser](./docs/teaser.png)\n\n## Changelog\n\n- 2023.03.20 Release Demo.\n- 2023.03.19 Release Code.\n- 2023.03.09 Paper preprint on arXiv.\n\n## Todo\n\n- [x] Release the code with 6 examples.\n- [x] Update a faster version.\n- [x] Release data.\n- [x] Release the Gradio Demo.\n- [x] Add local Gradio Demo.\n\n## Setup\n\n``` bash\nconda create --name vp2p python=3.9\nconda activate vp2p\npip install -r requirements.txt\n```\n\nThe code was tested on both Tesla V100 32GB and RTX 3090 24GB. At least 20GB VRAM is required.\n\nThe environment is similar to [Tune-A-Video](https://github.com/showlab/Tune-A-Video) and [prompt-to-prompt](https://github.com/google/prompt-to-prompt/).\n\n[xformers](https://github.com/facebookresearch/xformers) on an RTX 3090 may run into this [issue](https://github.com/bryandlee/Tune-A-Video/issues/4).\n\n## Quickstart\n\nPlease replace **pretrained_model_path** with the path to your Stable Diffusion model.\n\nTo download the pre-trained model, please refer to [diffusers](https://github.com/huggingface/diffusers).\n\nPlease download [sd1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5) and fill in the path at this [line](https://github.com/dvlab-research/Video-P2P/blob/c916cf4b10e7a105cc78688b41210a1543a8d9c9/configs/rabbit-jump-tune.yaml#L1C48-L1C69).\n\n\n``` bash\n# Stage 1: Tune the model for initialization.\n\n# You can reduce the number of tuning epochs to speed this up.\npython run_tuning.py --config=\"configs/rabbit-jump-tune.yaml\"\n```\n\n``` bash\n# Stage 2: Attention Control\n\n# Fast mode (1 min on V100):\npython run_videop2p.py --config=\"configs/rabbit-jump-p2p.yaml\" --fast\n\n# The official mode (10 mins on V100, more stable):\npython run_videop2p.py --config=\"configs/rabbit-jump-p2p.yaml\"\n```\n\nFind your results in **Video-P2P/outputs/xxx/results**.\n\n## Dataset\n\nWe release our 
dataset [here](https://drive.google.com/drive/folders/1EN501LLVg4FPeZ39lEBYHdXrs7nc5wDb?usp=sharing).\n\nDownload it to **./data** and explore your creativity!\n\n## Results\n\n\u003ctable class=\"center\"\u003e\n\u003ctr\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003econfigs/rabbit-jump-p2p.yaml\u003c/td\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003econfigs/penguin-run-p2p.yaml\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"https://video-p2p.github.io/assets/rabbit.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"https://video-p2p.github.io/assets/penguin-crochet.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003econfigs/man-motor-p2p.yaml\u003c/td\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003econfigs/car-drive-p2p.yaml\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"https://video-p2p.github.io/assets/motor.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"https://video-p2p.github.io/assets/car.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003econfigs/tiger-forest-p2p.yaml\u003c/td\u003e\n  \u003ctd width=50% style=\"text-align:center;\"\u003econfigs/bird-forest-p2p.yaml\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n  \u003ctd\u003e\u003cimg src=\"https://video-p2p.github.io/assets/tiger.gif\"\u003e\u003c/td\u003e\n  \u003ctd\u003e\u003cimg src=\"https://video-p2p.github.io/assets/bird-child.gif\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n## Gradio demo\nRun the following command to launch the local demo built with [gradio](https://gradio.app/):\n``` bash\npython app_gradio.py\n```\nFind the demo on Hugging Face [here](https://huggingface.co/spaces/video-p2p-library/Video-P2P-Demo). 
The demo code borrows heavily from [Tune-A-Video](https://huggingface.co/spaces/Tune-A-Video-library/Tune-A-Video-Training-UI).\n\n## Citation \n```\n@misc{liu2023videop2p,\n      author={Liu, Shaoteng and Zhang, Yuechen and Li, Wenbo and Lin, Zhe and Jia, Jiaya},\n      title={Video-P2P: Video Editing with Cross-attention Control}, \n      journal={arXiv:2303.04761},\n      year={2023},\n}\n``` \n\n## References\n* prompt-to-prompt: https://github.com/google/prompt-to-prompt\n* Tune-A-Video: https://github.com/showlab/Tune-A-Video\n* diffusers: https://github.com/huggingface/diffusers\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdvlab-research%2Fvideo-p2p","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdvlab-research%2Fvideo-p2p","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdvlab-research%2Fvideo-p2p/lists"}