https://github.com/open-mmlab/pia
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with text prompts, combined with DreamBooth, to achieve stunning videos.
- Host: GitHub
- URL: https://github.com/open-mmlab/pia
- Owner: open-mmlab
- License: apache-2.0
- Created: 2023-12-21T03:29:34.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-05T09:11:48.000Z (7 months ago)
- Last Synced: 2025-02-07T22:07:25.294Z (12 days ago)
- Topics: aigc, animation, diffusion-models, image-to-video, image-to-video-generation, personalized-generation, stable-diffusion
- Language: Python
- Homepage: https://pi-animator.github.io/
- Size: 79 MB
- Stars: 937
- Watchers: 24
- Forks: 75
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
# CVPR 2024 | PIA: Personalized Image Animator
[**PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models**](https://arxiv.org/abs/2312.13964)
[Yiming Zhang*](https://github.com/ymzhang0319), [Zhening Xing*](https://github.com/LeoXing1996/), [Yanhong Zeng†](https://zengyh1900.github.io/), [Youqing Fang](https://github.com/FangYouqing), [Kai Chen†](https://chenkai.site/)
(*equal contribution, †corresponding author)
[arXiv](https://arxiv.org/abs/2312.13964)
[Project Page](https://pi-animator.github.io)
[Open in OpenXLab](https://openxlab.org.cn/apps/detail/zhangyiming/PiaPia)
[Third Party Colab](https://colab.research.google.com/github/camenduru/PIA-colab/blob/main/PIA_colab.ipynb)
[HuggingFace Model](https://huggingface.co/Leoxing/PIA)
[Replicate](https://replicate.com/cjwbw/pia)

PIA is a personalized image animation method that can generate videos with **high motion controllability** and **strong text and image alignment**.
If you find our project helpful, please give it a star :star: or [cite](#bibtex) it. We would be very grateful :sparkling_heart:.
## What's New
- [x] `2024/01/03` [Replicate Demo & API](https://replicate.com/cjwbw/pia) support!
- [x] `2024/01/03` [Colab](https://github.com/camenduru/PIA-colab) support from [camenduru](https://github.com/camenduru)!
- [x] `2023/12/28` Support `scaled_dot_product_attention` for 1024x1024 images with just 16GB of GPU memory.
- [x] `2023/12/25` HuggingFace demo is available now! [🤗 Hub](https://huggingface.co/spaces/Leoxing/PIA/)
- [x] `2023/12/22` Release the demo of PIA on [OpenXLab](https://openxlab.org.cn/apps/detail/zhangyiming/PiaPia) and checkpoints on [Google Drive](https://drive.google.com/file/d/1RL3Fp0Q6pMD8PbGPULYUnvjqyRQXGHwN/view?usp=drive_link) or [OpenXLab](https://openxlab.org.cn/models/detail/zhangyiming/PIA)
## Setup
### Prepare Environment
Use the following commands to create a conda environment for PIA from scratch:
```
conda env create -f pia.yml
conda activate pia
```
You may also want to install it on top of an existing environment. In that case, use `environment-pt2.yaml` for PyTorch==2.0.0. To use a lower version of PyTorch (e.g. 1.13.1), use the following commands:
```
conda env create -f environment.yaml
conda activate pia
```
We strongly recommend PyTorch==2.0.0, which supports `scaled_dot_product_attention` for memory-efficient image animation.
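To check whether an existing environment already meets this recommendation, a small script along the following lines can help. It is a minimal sketch and not part of the PIA codebase: the helper name `sdpa_available` is hypothetical, and it only tests whether the installed PyTorch exposes `torch.nn.functional.scaled_dot_product_attention`, which was added in PyTorch 2.0.

```python
import importlib.util


def sdpa_available() -> bool:
    """Return True if the installed PyTorch exposes scaled_dot_product_attention.

    Hypothetical helper for illustration; not part of the PIA repository.
    """
    # If PyTorch is not installed at all, report False instead of raising.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch.nn.functional as F

    # The function was introduced in PyTorch 2.0, so older versions lack it.
    return hasattr(F, "scaled_dot_product_attention")


print("scaled_dot_product_attention available:", sdpa_available())
```

If this prints `False` inside the `pia` environment, the environment is probably using the older PyTorch setup rather than the PyTorch 2.0 one.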
### Download checkpoints
Download Stable Diffusion v1-5:
```
conda install git-lfs
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/
```
Download the PIA checkpoint:
```
git clone https://huggingface.co/Leoxing/PIA models/PIA/
```
Download the personalized models (RealisticVision, RcnzCartoon, MajicMix):
```
bash download_bashscripts/1-RealisticVision.sh
bash download_bashscripts/2-RcnzCartoon.sh
bash download_bashscripts/3-MajicMix.sh
```
You can also download *pia.ckpt* manually from [Google Drive](https://drive.google.com/file/d/1RL3Fp0Q6pMD8PbGPULYUnvjqyRQXGHwN/view?usp=drive_link) or [HuggingFace](https://huggingface.co/Leoxing/PIA).
Arrange the checkpoints as follows:
```
└── models
    ├── DreamBooth_LoRA
    │   ├── ...
    ├── PIA
    │   ├── pia.ckpt
    └── StableDiffusion
        ├── vae
        ├── unet
        └── ...
```
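Before running inference, it can be useful to confirm that the files landed where the layout above expects them. The following is a minimal sketch, not part of the PIA repository: the names `EXPECTED_PATHS` and `missing_checkpoints` are hypothetical, and the script checks only the paths named explicitly in the tree (the elided `...` entries are not checked).

```python
from pathlib import Path
from typing import List

# Paths taken from the layout above, relative to the repository root.
EXPECTED_PATHS = [
    "models/DreamBooth_LoRA",
    "models/PIA/pia.ckpt",
    "models/StableDiffusion/vae",
    "models/StableDiffusion/unet",
]


def missing_checkpoints(repo_root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


missing = missing_checkpoints(".")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All expected checkpoint paths are present.")
```

Run it from the repository root. Any path it reports as missing points to a download step that did not complete.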
## Inference
### Image Animation
Image-to-video results can be obtained by running:
```
python inference.py --config=example/config/lighthouse.yaml
python inference.py --config=example/config/harry.yaml
python inference.py --config=example/config/majic_girl.yaml
```
After running the commands above, you can find the results in `example/result`:
- Input image animated with the prompts `lightning, lighthouse`, `sun rising, lighthouse`, and `fireworks, lighthouse` (images omitted)
- Input image animated with the prompts `1boy smiling`, `1boy playing the magic fire`, and `1boy is waving hands` (images omitted)
- Input image animated with the prompts `1girl is smiling`, `1girl is crying`, and `1girl, snowing` (images omitted)
### Motion Magnitude
You can control the motion magnitude through the `--magnitude` parameter:
```sh
python inference.py --config=example/config/xxx.yaml --magnitude=0 # Small Motion
python inference.py --config=example/config/xxx.yaml --magnitude=1 # Moderate Motion
python inference.py --config=example/config/xxx.yaml --magnitude=2 # Large Motion
```
Examples:
```sh
python inference.py --config=example/config/labrador.yaml
python inference.py --config=example/config/bear.yaml
python inference.py --config=example/config/genshin.yaml
```
Result grid (images omitted): each row shows the input image and prompt, followed by the Small Motion, Moderate Motion, and Large Motion results.
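To compare all three motion levels for one image, the commands can be generated in a loop. The snippet below is a minimal sketch under stated assumptions: the helper `inference_command` is hypothetical, it uses only the `--config` and `--magnitude` flags shown above, and it accepts only the three documented values 0, 1, and 2.

```python
import shlex
from typing import List


def inference_command(config: str, magnitude: int) -> List[str]:
    """Build the inference command for one motion magnitude (0, 1, or 2)."""
    if magnitude not in (0, 1, 2):
        raise ValueError("magnitude must be 0 (small), 1 (moderate), or 2 (large)")
    return ["python", "inference.py", f"--config={config}", f"--magnitude={magnitude}"]


# Print one command per magnitude for a single example config.
for m in (0, 1, 2):
    print(shlex.join(inference_command("example/config/labrador.yaml", m)))
```

Each returned list can be passed to `subprocess.run(..., check=True)` from the repository root to execute the run instead of printing it.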
### Style Transfer
To achieve style transfer, run inference with the `--style_transfer` flag (*don't forget to set the base model in xxx.yaml*).
Examples:
```sh
python inference.py --config example/config/concert.yaml --style_transfer
python inference.py --config example/config/anya.yaml --style_transfer
```
Result grid (images omitted): each row shows the input image and base model, followed by the results for each prompt.
- Prompts `1man is smiling`, `1man is crying`, and `1man is singing` (two rows using the same input image)
- Prompts `1girl smiling`, `1girl open mouth`, and `1girl is crying, pout`
### Loop Video
You can generate a looping video with the `--loop` parameter:
```sh
python inference.py --config=example/config/xxx.yaml --loop
```
Examples:
```sh
python inference.py --config=example/config/lighthouse.yaml --loop
python inference.py --config=example/config/labrador.yaml --loop
```
Result grid (images omitted): each row shows the input image followed by the looping result for each prompt.
- Prompts `lightning, lighthouse`, `sun rising, lighthouse`, and `fireworks, lighthouse`
- Prompts `labrador jumping`, `labrador walking`, and `labrador running`
## Training
We provide a [training script](train.py) for PIA. It borrows heavily from [AnimateDiff](https://github.com/guoyww/AnimateDiff/tree/main), so please prepare the dataset and configuration files according to the AnimateDiff [guideline](https://github.com/guoyww/AnimateDiff/blob/main/__assets__/docs/animatediff.md#steps-for-training).
After preparation, you can train the model by running the following command using torchrun:
```shell
torchrun --nnodes=1 --nproc_per_node=1 train.py --config example/config/train.yaml
```
or with Slurm:
```shell
srun --quotatype=reserved --job-name=pia --gres=gpu:8 --ntasks-per-node=8 --ntasks=8 --cpus-per-task=4 --kill-on-bad-exit=1 python train.py --config example/config/train.yaml
```
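The `torchrun` example above starts a single process, while the Slurm example requests eight GPUs. As a minimal sketch, the helper below builds a single-node `torchrun` command with one process per GPU. The helper name `torchrun_command` is hypothetical, and it is an assumption here that `train.py` behaves the same under multi-process `torchrun` as it does under the Slurm launch.

```python
import shlex
from typing import List


def torchrun_command(num_gpus: int, config: str = "example/config/train.yaml") -> List[str]:
    """Build a single-node torchrun command that starts one process per GPU."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchrun",
        "--nnodes=1",
        f"--nproc_per_node={num_gpus}",
        "train.py",
        "--config",
        config,
    ]


# Print the command for an 8-GPU node, matching the GPU count in the Slurm example.
print(shlex.join(torchrun_command(8)))
```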
## AnimateBench
We have open-sourced AnimateBench on [HuggingFace](https://huggingface.co/datasets/ymzhang319/AnimateBench) which includes images, prompts and configs to evaluate PIA and other image animation methods.
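One way to fetch the benchmark locally is through the `huggingface_hub` package. The following is a minimal sketch, not an official download script: the function name `download_animatebench` and the default target directory are hypothetical, and it assumes `huggingface_hub` is installed (`pip install huggingface_hub`).

```python
ANIMATEBENCH_REPO_ID = "ymzhang319/AnimateBench"


def download_animatebench(local_dir: str = "AnimateBench") -> str:
    """Download the AnimateBench dataset repository and return its local path."""
    # Imported lazily so this file loads even when huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=ANIMATEBENCH_REPO_ID,
        repo_type="dataset",  # AnimateBench is hosted as a dataset repository
        local_dir=local_dir,
    )
```

Calling `download_animatebench()` needs network access and returns the directory containing the downloaded files.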
## BibTex
```
@inproceedings{zhang2024pia,
title={Pia: Your personalized image animator via plug-and-play modules in text-to-image models},
author={Zhang, Yiming and Xing, Zhening and Zeng, Yanhong and Fang, Youqing and Chen, Kai},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={7747--7756},
year={2024}
}
```
## Contact Us
**Yiming Zhang**: [email protected]
**Zhening Xing**: [email protected]
**Yanhong Zeng**: [email protected]
## Acknowledgements
The code is built upon [AnimateDiff](https://github.com/guoyww/AnimateDiff), [Tune-a-Video](https://github.com/showlab/Tune-A-Video), and [PySceneDetect](https://github.com/Breakthrough/PySceneDetect).