{"id":29009907,"url":"https://github.com/tencentarc/videopainter","last_synced_at":"2025-06-25T15:33:41.777Z","repository":{"id":282178891,"uuid":"945243550","full_name":"TencentARC/VideoPainter","owner":"TencentARC","description":"Any-length Video Inpainting and Editing with Plug-and-Play Context Control","archived":false,"fork":false,"pushed_at":"2025-04-08T03:54:58.000Z","size":77173,"stargazers_count":306,"open_issues_count":8,"forks_count":18,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-04-08T04:28:43.724Z","etag":null,"topics":["video","video-dataset","video-editing","video-inpainting"],"latest_commit_sha":null,"homepage":"https://yxbian23.github.io/project/video-painter/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TencentARC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-09T01:21:24.000Z","updated_at":"2025-04-08T03:55:01.000Z","dependencies_parsed_at":"2025-04-01T04:12:35.363Z","dependency_job_id":null,"html_url":"https://github.com/TencentARC/VideoPainter","commit_stats":null,"previous_names":["tencentarc/videopainter"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TencentARC/VideoPainter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FVideoPainter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FVideoPainter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FVideoPainter/releases","manifests_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FVideoPainter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TencentARC","download_url":"https://codeload.github.com/TencentARC/VideoPainter/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FVideoPainter/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261901407,"owners_count":23227593,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["video","video-dataset","video-editing","video-inpainting"],"created_at":"2025-06-25T15:33:40.535Z","updated_at":"2025-06-25T15:33:41.741Z","avatar_url":"https://github.com/TencentARC.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# VideoPainter\n\n[SIGGRAPH 2025] Official code of the paper \"VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control\"\n\nKeywords: Video Inpainting, Video Editing, Video Generation\n\n\u003e [Yuxuan Bian](https://yxbian23.github.io/)\u003csup\u003e12\u003c/sup\u003e, [Zhaoyang Zhang](https://zzyfd.github.io/#/)\u003csup\u003e1‡\u003c/sup\u003e, [Xuan Ju](https://juxuan27.github.io/)\u003csup\u003e2\u003c/sup\u003e, [Mingdeng Cao](https://openreview.net/profile?id=~Mingdeng_Cao1)\u003csup\u003e3\u003c/sup\u003e, [Liangbin Xie](https://liangbinxie.github.io/)\u003csup\u003e4\u003c/sup\u003e, [Ying Shan](https://www.linkedin.com/in/YingShanProfile/)\u003csup\u003e1\u003c/sup\u003e, [Qiang 
Xu](https://cure-lab.github.io/)\u003csup\u003e2✉\u003c/sup\u003e\u003cbr\u003e\n\u003e \u003csup\u003e1\u003c/sup\u003eARC Lab, Tencent PCG \u003csup\u003e2\u003c/sup\u003eThe Chinese University of Hong Kong \u003csup\u003e3\u003c/sup\u003eThe University of Tokyo \u003csup\u003e4\u003c/sup\u003eUniversity of Macau \u003csup\u003e‡\u003c/sup\u003eProject Lead \u003csup\u003e✉\u003c/sup\u003eCorresponding Author\n\n\n\n\u003cp align=\"center\"\u003e\n\u003ca href='https://yxbian23.github.io/project/video-painter'\u003e\u003cimg src='https://img.shields.io/badge/Project-Page-Green'\u003e\u003c/a\u003e \u0026nbsp;\n\u003ca href=\"https://arxiv.org/abs/2503.05639\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-2503.05639-b31b1b.svg\"\u003e\u003c/a\u003e \u0026nbsp;\n\u003ca href=\"https://github.com/TencentARC/VideoPainter\"\u003e\u003cimg src=\"https://img.shields.io/badge/GitHub-Code-black?logo=github\"\u003e\u003c/a\u003e \u0026nbsp;\n\u003ca href=\"https://youtu.be/HYzNfsD3A0s\"\u003e\u003cimg src=\"https://img.shields.io/badge/YouTube-Video-red?logo=youtube\"\u003e\u003c/a\u003e \u0026nbsp;\n\u003ca href='https://huggingface.co/datasets/TencentARC/VPData'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-blue'\u003e\u003c/a\u003e \u0026nbsp;\n\u003ca href='https://huggingface.co/datasets/TencentARC/VPBench'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Benchmark-blue'\u003e\u003c/a\u003e \u0026nbsp;\n\u003ca href=\"https://huggingface.co/TencentARC/VideoPainter\"\u003e\u003cimg src=\"https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n**Your star means a lot for us to develop this project!** ⭐⭐⭐\n\n**VPData and VPBench have been fully uploaded (contain 390K mask sequences and video captions). 
You are welcome to use VPData, our largest video segmentation dataset with video captions!** 🔥🔥🔥 \n\n\n**📖 Table of Contents**\n\n\n- [VideoPainter](#videopainter)\n  - [🔥 Update Log](#-update-log)\n  - [📌 TODO](#todo)\n  - [🛠️ Method Overview](#️-method-overview)\n  - [🚀 Getting Started](#-getting-started)\n    - [Environment Requirement 🌍](#environment-requirement-)\n    - [Data Download ⬇️](#data-download-️)\n  - [🏃🏼 Running Scripts](#-running-scripts)\n    - [Training 🤯](#training-)\n    - [Inference 📜](#inference-)\n    - [Evaluation 📏](#evaluation-)\n  - [🤝🏼 Cite Us](#-cite-us)\n  - [💖 Acknowledgement](#-acknowledgement)\n\n\n\n## 🔥 Update Log\n- [2025/3/09] 📢 📢  [VideoPainter](https://huggingface.co/TencentARC/VideoPainter) is released: an efficient, any-length video inpainting \u0026 editing framework with plug-and-play context control.\n- [2025/3/09] 📢 📢  [VPData](https://huggingface.co/datasets/TencentARC/VPData) and [VPBench](https://huggingface.co/datasets/TencentARC/VPBench) are released: the largest video inpainting dataset with precise segmentation masks and dense video captions (\u003e390K clips).\n- [2025/3/25] 📢 📢  The 390K+ high-quality video segmentation masks of [VPData](https://huggingface.co/datasets/TencentARC/VPData) have been fully released.\n- [2025/3/25] 📢 📢  The raw videos of the Videovo subset have been uploaded to [VPData](https://huggingface.co/datasets/TencentARC/VPData) to solve the raw video link expiration issue.\n- [2025/4/08] 📢 📢  VideoPainter has been accepted by [SIGGRAPH 2025](https://s2025.siggraph.org/)!\n\n## TODO\n\n- [x] Release training and inference code\n- [x] Release evaluation code\n- [x] Release [VideoPainter checkpoints](https://huggingface.co/TencentARC/VideoPainter) (based on CogVideoX-5B)\n- [x] Release [VPData and VPBench](https://huggingface.co/collections/TencentARC/videopainter-67cc49c6146a48a2ba93d159) for large-scale training and evaluation.\n- [x] Release gradio demo\n- [ ] Data preprocessing code\n\n## 🛠️ Method 
Overview\n\nWe propose VideoPainter, a novel dual-stream paradigm that incorporates an efficient context encoder (comprising only 6\\% of the backbone parameters) to process masked videos and inject backbone-aware background contextual cues into any pre-trained video DiT, producing semantically consistent content in a plug-and-play manner. This architectural separation significantly reduces the model's learning complexity while enabling nuanced integration of crucial background context. We also introduce a novel target region ID resampling technique that enables any-length video inpainting, greatly enhancing the method's practical applicability. Additionally, we establish a scalable dataset pipeline leveraging current vision understanding models, contributing VPData and VPBench to facilitate segmentation-based inpainting training and assessment; together they form the largest video inpainting dataset and benchmark to date, with over 390K diverse clips. Using inpainting as a pipeline basis, we also explore downstream applications including video editing and video editing pair data generation, demonstrating competitive performance and significant practical potential. \n![](assets/teaser.jpg)\n\n\n\n## 🚀 Getting Started\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEnvironment Requirement 🌍\u003c/b\u003e\u003c/summary\u003e\n\n\nClone the repo:\n\n```\ngit clone https://github.com/TencentARC/VideoPainter.git\n```\n\nWe recommend first using `conda` to create a virtual environment and install the required libraries. 
For example:\n\n\n```\nconda create -n videopainter python=3.10 -y\nconda activate videopainter\npip install -r requirements.txt\n```\n\nThen, you can install diffusers (implemented in this repo) with:\n\n```\ncd ./diffusers\npip install -e .\n```\n\nAfter that, you can install the required ffmpeg through:\n\n```\nconda install -c conda-forge ffmpeg -y\n```\n\nOptionally, you can install sam2 for the gradio demo through:\n\n```\ncd ./app\npip install -e .\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eVPBench and VPData Download ⬇️\u003c/b\u003e\u003c/summary\u003e\n\nYou can download VPBench [here](https://huggingface.co/datasets/TencentARC/VPBench) and VPData [here](https://huggingface.co/datasets/TencentARC/VPData) (as well as the Davis data we re-processed), which are used for training and testing VideoPainter. By downloading the data, you are agreeing to the terms and conditions of the license. The data structure should look like:\n\n```\n|-- data\n    |-- davis\n        |-- JPEGImages_432_240\n        |-- test_masks\n        |-- davis_caption\n        |-- test.json\n        |-- train.json\n    |-- videovo/raw_video\n        |-- 000005000\n            |-- 000005000000.0.mp4\n            |-- 000005000001.0.mp4\n            |-- ...\n        |-- 000005001\n        |-- ...\n    |-- pexels/pexels/raw_video\n        |-- 000000000\n            |-- 000000000000_852038.mp4\n            |-- 000000000001_852057.mp4\n            |-- ...\n        |-- 000000001\n        |-- ...\n    |-- video_inpainting\n        |-- videovo\n            |-- 000005000000/all_masks.npz\n            |-- 000005000001/all_masks.npz\n            |-- ...\n        |-- pexels\n            |-- ...\n    |-- pexels_videovo_train_dataset.csv\n    |-- pexels_videovo_val_dataset.csv\n    |-- pexels_videovo_test_dataset.csv\n    |-- our_video_inpaint.csv\n    |-- our_video_inpaint_long.csv\n    |-- our_video_edit.csv\n    |-- our_video_edit_long.csv\n    |-- pexels.csv\n    |-- 
videovo.csv\n    \n```\n\nYou can download VPBench and place it in the `data` folder by:\n```\ngit lfs install\ngit clone https://huggingface.co/datasets/TencentARC/VPBench\nmv VPBench data\ncd data\nunzip pexels.zip\nunzip videovo.zip\nunzip davis.zip\nunzip video_inpainting.zip\n```\n\nYou can download VPData (only mask and text annotations due to the space limit) and place it in the `data` folder by:\n```\ngit lfs install\ngit clone https://huggingface.co/datasets/TencentARC/VPData\nmv VPBench data\n\n# 1. unzip the masks in VPData\npython data_utils/unzip_folder.py --source_dir ./data/videovo_masks --target_dir ./data/video_inpainting/videovo\npython data_utils/unzip_folder.py --source_dir ./data/pexels_masks --target_dir ./data/video_inpainting/pexels\n\n# 2. unzip the raw videos in Videovo subset in VPData\npython data_utils/unzip_folder.py --source_dir ./data/videovo_raw_videos --target_dir ./data/videovo/raw_video\n```\n\nNote: *Due to the space limit, you need to run the following script to download the raw videos of the Pexels subset in VPData. The format should be consistent with VPData/VPBench above (after downloading VPData/VPBench, the script will automatically place the raw videos of VPData into the corresponding dataset directories that have been created by VPBench).*\n\n```\ncd data_utils\npython VPData_download.py\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eCheckpoints\u003c/b\u003e\u003c/summary\u003e\n\nCheckpoints of VideoPainter can be downloaded from [here](https://huggingface.co/TencentARC/VideoPainter). The ckpt folder contains:\n\n- VideoPainter pretrained checkpoints for CogVideoX-5b-I2V \n- VideoPainter IP Adapter pretrained checkpoints for CogVideoX-5b-I2V \n- Pretrained CogVideoX-5b-I2V checkpoint from [HuggingFace](https://huggingface.co/THUDM/CogVideoX-5b-I2V). 
\n\nYou can download the checkpoints and place them in the `ckpt` folder by:\n```\ngit lfs install\ngit clone https://huggingface.co/TencentARC/VideoPainter\nmv VideoPainter ckpt\n```\n\nYou also need to download the base model [CogVideoX-5B-I2V](https://huggingface.co/THUDM/CogVideoX-5b-I2V) by:\n```\ngit lfs install\ncd ckpt\ngit clone https://huggingface.co/THUDM/CogVideoX-5b-I2V\n```\n\n[Optional] Download [FLUX.1-Fill-dev](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev/) for first-frame inpainting:\n```\ngit lfs install\ncd ckpt\ngit clone https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev\n# already inside ckpt/, so use paths relative to it\nmv FLUX.1-Fill-dev flux_inp\n```\n\n[Optional] Download [SAM2](https://huggingface.co/facebook/sam2-hiera-large) for video segmentation in the gradio demo:\n```\ngit lfs install\ncd ckpt\nwget https://huggingface.co/facebook/sam2-hiera-large/resolve/main/sam2_hiera_large.pt\n```\nYou can also choose segmentation checkpoints of other sizes to balance efficiency and performance, such as [SAM2-Tiny](https://huggingface.co/facebook/sam2-hiera-tiny).\n\nThe ckpt structure should look like:\n\n```\n|-- ckpt\n    |-- VideoPainter/checkpoints\n        |-- branch\n            |-- config.json\n            |-- diffusion_pytorch_model.safetensors\n    |-- VideoPainterID/checkpoints\n        |-- pytorch_lora_weights.safetensors\n    |-- CogVideoX-5b-I2V\n        |-- scheduler\n        |-- transformer\n        |-- vae\n        |-- ...\n    |-- flux_inp\n        |-- scheduler\n        |-- transformer\n        |-- vae\n        |-- ...\n    |-- sam2_hiera_large.pt\n```\n\u003c/details\u003e\n\n## 🏃🏼 Running Scripts\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eTraining 🤯\u003c/b\u003e\u003c/summary\u003e\n\nYou can train VideoPainter using the scripts:\n\n```\n# cd train\n# bash VideoPainter.sh\n\nexport MODEL_PATH=\"../ckpt/CogVideoX-5b-I2V\"\nexport CACHE_PATH=\"~/.cache\"\nexport 
DATASET_PATH=\"../data/videovo/raw_video\"\nexport PROJECT_NAME=\"pexels_videovo-inpainting\"\nexport RUNS_NAME=\"VideoPainter\"\nexport OUTPUT_PATH=\"./${PROJECT_NAME}/${RUNS_NAME}\"\nexport PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True\nexport TOKENIZERS_PARALLELISM=false\nexport CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7\n\naccelerate launch --config_file accelerate_config_machine_single_ds.yaml  --machine_rank 0 \\\n  train_cogvideox_inpainting_i2v_video.py \\\n  --pretrained_model_name_or_path $MODEL_PATH \\\n  --cache_dir $CACHE_PATH \\\n  --meta_file_path ../data/pexels_videovo_train_dataset.csv \\\n  --val_meta_file_path ../data/pexels_videovo_val_dataset.csv \\\n  --instance_data_root $DATASET_PATH \\\n  --dataloader_num_workers 1 \\\n  --num_validation_videos 1 \\\n  --validation_epochs 1 \\\n  --seed 42 \\\n  --mixed_precision bf16 \\\n  --output_dir $OUTPUT_PATH \\\n  --height 480 \\\n  --width 720 \\\n  --fps 8 \\\n  --max_num_frames 49 \\\n  --video_reshape_mode \"resize\" \\\n  --skip_frames_start 0 \\\n  --skip_frames_end 0 \\\n  --max_text_seq_length 226 \\\n  --branch_layer_num 2 \\\n  --train_batch_size 1 \\\n  --num_train_epochs 10 \\\n  --checkpointing_steps 1024 \\\n  --validating_steps 256 \\\n  --gradient_accumulation_steps 1 \\\n  --learning_rate 1e-5 \\\n  --lr_scheduler cosine_with_restarts \\\n  --lr_warmup_steps 1000 \\\n  --lr_num_cycles 1 \\\n  --enable_slicing \\\n  --enable_tiling \\\n  --noised_image_dropout 0.05 \\\n  --gradient_checkpointing \\\n  --optimizer AdamW \\\n  --adam_beta1 0.9 \\\n  --adam_beta2 0.95 \\\n  --max_grad_norm 1.0 \\\n  --allow_tf32 \\\n  --report_to wandb \\\n  --tracker_name $PROJECT_NAME \\\n  --runs_name $RUNS_NAME \\\n  --inpainting_loss_weight 1.0 \\\n  --mix_train_ratio 0 \\\n  --first_frame_gt \\\n  --mask_add \\\n  --mask_transform_prob 0.3 \\\n  --p_brush 0.4 \\\n  --p_rect 0.1 \\\n  --p_ellipse 0.1 \\\n  --p_circle 0.1 \\\n  --p_random_brush 0.3\n\n# cd train\n# bash VideoPainterID.sh\nexport 
MODEL_PATH=\"../ckpt/CogVideoX-5b-I2V\"\nexport BRANCH_MODEL_PATH=\"../ckpt/VideoPainter/checkpoints/branch\"\nexport CACHE_PATH=\"~/.cache\"\nexport DATASET_PATH=\"../data/videovo/raw_video\"\nexport PROJECT_NAME=\"pexels_videovo-inpainting\"\nexport RUNS_NAME=\"VideoPainterID\"\nexport OUTPUT_PATH=\"./${PROJECT_NAME}/${RUNS_NAME}\"\nexport PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True\nexport TOKENIZERS_PARALLELISM=false\nexport CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7\n\naccelerate launch --config_file accelerate_config_machine_single_ds_wo_cpu.yaml --machine_rank 0 \\\n  train_cogvideox_inpainting_i2v_video_resample.py \\\n  --pretrained_model_name_or_path $MODEL_PATH \\\n  --cogvideox_branch_name_or_path $BRANCH_MODEL_PATH \\\n  --cache_dir $CACHE_PATH \\\n  --meta_file_path ../data/pexels_videovo_train_dataset.csv \\\n  --val_meta_file_path ../data/pexels_videovo_val_dataset.csv \\\n  --instance_data_root $DATASET_PATH \\\n  --dataloader_num_workers 1 \\\n  --num_validation_videos 1 \\\n  --validation_epochs 1 \\\n  --seed 42 \\\n  --rank 256 \\\n  --lora_alpha 128 \\\n  --mixed_precision bf16 \\\n  --output_dir $OUTPUT_PATH \\\n  --height 480 \\\n  --width 720 \\\n  --fps 8 \\\n  --max_num_frames 49 \\\n  --video_reshape_mode \"resize\" \\\n  --skip_frames_start 0 \\\n  --skip_frames_end 0 \\\n  --max_text_seq_length 226 \\\n  --branch_layer_num 2 \\\n  --train_batch_size 1 \\\n  --num_train_epochs 10 \\\n  --checkpointing_steps 256 \\\n  --validating_steps 128 \\\n  --gradient_accumulation_steps 1 \\\n  --learning_rate 5e-5 \\\n  --lr_scheduler cosine_with_restarts \\\n  --lr_warmup_steps 200 \\\n  --lr_num_cycles 1 \\\n  --enable_slicing \\\n  --enable_tiling \\\n  --noised_image_dropout 0.05 \\\n  --gradient_checkpointing \\\n  --optimizer AdamW \\\n  --adam_beta1 0.9 \\\n  --adam_beta2 0.95 \\\n  --max_grad_norm 1.0 \\\n  --allow_tf32 \\\n  --report_to wandb \\\n  --tracker_name $PROJECT_NAME \\\n  --runs_name $RUNS_NAME \\\n  --inpainting_loss_weight 1.0 
\\\n  --mix_train_ratio 0 \\\n  --first_frame_gt \\\n  --mask_add \\\n  --mask_transform_prob 0.3 \\\n  --p_brush 0.4 \\\n  --p_rect 0.1 \\\n  --p_ellipse 0.1 \\\n  --p_circle 0.1 \\\n  --p_random_brush 0.3 \\\n  --id_pool_resample_learnable\n```\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eInference 📜\u003c/b\u003e\u003c/summary\u003e\n\nYou can run inference for video inpainting or editing with the scripts:\n\n```\ncd infer\n# video inpainting\nbash inpaint.sh\n# video inpainting with ID resampling\nbash inpaint_id_resample.sh\n# video editing\nbash edit.sh\n```\n\nVideoPainter can also function as a video editing pair data generator. You can run inference with the script:\n```\nbash edit_bench.sh\n```\n\nSince VideoPainter is trained on public Internet videos, it primarily performs well on general scenarios. For high-quality industrial applications (e.g., product exhibitions, virtual try-on), we recommend training the model on your domain-specific data. 
We welcome and appreciate any contributions of trained models from the community!\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eGradio Demo 🖌️\u003c/b\u003e\u003c/summary\u003e\n\nYou can also run inference through the gradio demo:\n\n```\n# cd app\nCUDA_VISIBLE_DEVICES=0 python app.py \\\n    --model_path ../ckpt/CogVideoX-5b-I2V \\\n    --inpainting_branch ../ckpt/VideoPainter/checkpoints/branch \\\n    --id_adapter ../ckpt/VideoPainterID/checkpoints \\\n    --img_inpainting_model ../ckpt/flux_inp\n```\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEvaluation 📏\u003c/b\u003e\u003c/summary\u003e\n\nYou can evaluate using the scripts:\n\n```\ncd evaluate\n# video inpainting\nbash eval_inpainting.sh\n# video inpainting with ID resampling\nbash eval_inpainting_id_resample.sh\n# video editing\nbash eval_edit.sh\n# video editing with ID resampling\nbash eval_editing_id_resample.sh\n```\n\u003c/details\u003e\n\n## 🤝🏼 Cite Us\n\n```\n@article{bian2025videopainter,\n  title={VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control},\n  author={Bian, Yuxuan and Zhang, Zhaoyang and Ju, Xuan and Cao, Mingdeng and Xie, Liangbin and Shan, Ying and Xu, Qiang},\n  journal={arXiv preprint arXiv:2503.05639},\n  year={2025}\n}\n```\n\n\n## 💖 Acknowledgement\n\u003cspan id=\"acknowledgement\"\u003e\u003c/span\u003e\n\nOur code is built on [diffusers](https://github.com/huggingface/diffusers) and [CogVideoX](https://github.com/THUDM/CogVideo). Thanks to all the contributors!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftencentarc%2Fvideopainter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftencentarc%2Fvideopainter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftencentarc%2Fvideopainter/lists"}