[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
https://github.com/PKU-YuanGroup/ChronoMagic-Bench
- Host: GitHub
- URL: https://github.com/pku-yuangroup/chronomagic-bench
- Owner: PKU-YuanGroup
- License: apache-2.0
- Created: 2024-06-25T03:09:39.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-09T14:11:14.000Z (8 months ago)
- Last Synced: 2025-03-28T16:04:20.529Z (6 months ago)
- Topics: aigc, benchmark, dataset, diffusion-models, evaluation, evaluation-kit, gen-ai, metamorphic-video-generation, open-sora-plan, text-to-video, time-lapse, time-lapse-dataset, video-generation
- Language: Python
- Homepage: https://pku-yuangroup.github.io/ChronoMagic-Bench/
- Size: 13.3 MB
- Stars: 195
- Watchers: 2
- Forks: 14
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
[NeurIPS D&B 2024 Spotlight] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
If you like our project, please give us a star ⭐ on GitHub for the latest updates.
[🤗 Demo](https://huggingface.co/spaces/BestWishYsh/ChronoMagic-Bench) |
[🤗 Paper Page](https://huggingface.co/papers/2406.18522) |
[arXiv](https://arxiv.org/abs/2406.18522) |
[Project Page](https://pku-yuangroup.github.io/ChronoMagic-Bench/) |
[ChronoMagic-Pro Dataset](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Pro) |
[ChronoMagic-ProH Dataset](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-ProH) |
[Sampled Videos](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main/Results) |
[Twitter/X Post 1](https://twitter.com/AdeenaY8/status/1806409038743171191) |
[Twitter/X Post 2](https://twitter.com/vhjf36495872/status/1806151450441159024?s=61&t=lLg2j2-sZ9igea_Cj3ToLw) |
[License](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/blob/main/LICENSE) |
[GitHub Repo](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/)
This repository is the official implementation of ChronoMagic-Bench, a benchmark for metamorphic evaluation of text-to-time-lapse video generation. The key insight is to probe how well text-to-video generation models capture physical, biological, and chemical priors by asking them to generate time-lapse videos, which are rich in such priors, from free-form text prompts.
💡 We also have other video generation projects that may interest you ✨.
> [**Open-Sora Plan: Open-Source Large Video Generation Model**](https://arxiv.org/abs/2412.00131)
> Bin Lin, Yunyang Ge, Xinhua Cheng, et al.
> [GitHub](https://github.com/PKU-YuanGroup/Open-Sora-Plan) | [arXiv](https://arxiv.org/abs/2412.00131)
>
> [**MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators**](https://arxiv.org/abs/2404.05014)
> Shenghai Yuan, Jinfa Huang, Yujun Shi, et al.
> [GitHub](https://github.com/PKU-YuanGroup/MagicTime) | [arXiv](https://arxiv.org/abs/2404.05014)
>
> [**ConsisID: Identity-Preserving Text-to-Video Generation by Frequency Decomposition**](https://arxiv.org/abs/2411.17440)
> Shenghai Yuan, Jinfa Huang, Xianyi He, et al.
> [GitHub](https://github.com/PKU-YuanGroup/ConsisID/) | [arXiv](https://arxiv.org/abs/2411.17440)

## 📣 News
* ⏳⏳⏳ Evaluate more Text-to-Video Generation Models via *ChronoMagic-Bench*.
* `[2024.12.31]` 🔥 We further evaluate the widely popular [Sora](https://openai.com/sora/), which struggles to generate a diverse range of time-lapse videos effectively. This suggests that the generation of high-quality metamorphic videos remains an area in need of further exploration. The results are available [here](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main/Results/Closed_Source).
* `[2024.09.30]` 🔥 We have updated the calculation of the CHScore, making it more robust to temporally coherent disappearance of points. You can click [here](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/CHScore) for detailed implementation.
* `[2024.09.26]` ✨ Our paper is accepted by the **NeurIPS 2024 D&B track** as a **spotlight** presentation.
* `[2024.08.13]` 🔥 We further evaluate [EasyAnimate-V3](https://github.com/aigc-apps/EasyAnimate) and [CogVideoX-2B](https://github.com/THUDM/CogVideo). The results are available [here](https://huggingface.co/spaces/BestWishYsh/ChronoMagic-Bench).
* `[2024.06.30]` 🔥 We release the code of the **"Multi-Aspect Data Preprocessing"**, which is used to process the *ChronoMagic-Pro* dataset. Please click [here](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/Multi-Aspect_Preprocessing) and [here](https://huggingface.co/papers/2406.18522) to see more details.
* `[2024.06.29]` 🔥 Support evaluating customized Text-to-Video models. The code and instructions are available in this repo.
* `[2024.06.28]` 🔥 We release the **ChronoMagic-Pro** and **ChronoMagic-ProH** datasets. The datasets include **460K** and **150K** time-lapse video-text pairs respectively and can be downloaded at [HF-Dataset-Pro](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Pro) and [HF-Dataset-ProH](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-ProH).
* `[2024.06.27]` 🔥 We release the **arXiv paper** and **Leaderboard** for *ChronoMagic-Bench*, and you can click [here](https://arxiv.org/abs/2406.18522) to read the paper and [here](https://huggingface.co/spaces/BestWishYsh/ChronoMagic-Bench) to see the leaderboard.
* `[2024.06.26]` 🔥 We release the **testing prompts**, **reference videos** and **generated results** by different models in *ChronoMagic-Bench*, and you can click [here](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench) to see more details.
* `[2024.06.25]` 🔥 **All code & datasets** are coming soon! Stay tuned 👀!

## 😮 Highlights
*ChronoMagic-Bench* reflects the **physical prior capacity** of text-to-video generation models.
#### Resources
* [ChronoMagic-Bench](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main): 1,649 time-lapse video-text pairs (captioned by GPT-4o).
* [ChronoMagic-Bench-150](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main): 150 time-lapse video-text pairs (captioned by GPT-4o).
* [ChronoMagic](https://huggingface.co/datasets/BestWishYsh/ChronoMagic): 2,265 time-lapse video-text pairs (captioned by GPT-4V).
* [ChronoMagic-Pro](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Pro): 460K time-lapse video-text pairs (captioned by ShareGPT4Video).
* [ChronoMagic-ProH](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-ProH): 150K time-lapse video-text pairs (captioned by ShareGPT4Video).

### :mega: Overview
In contrast to existing benchmarks, **ChronoMagic-Bench** emphasizes generating videos with high persistence and strong variation, i.e., metamorphic time-lapse videos with high physical prior content.
| Benchmark | Type | Visual Quality | Text Relevance | Metamorphic Amplitude | Temporal Coherence |
|---|---|---|---|---|---|
| UCF-101 | General | ✔️ | ✔️ | ❌ | ❌ |
| Make-a-Video-Eval | General | ✔️ | ✔️ | ❌ | ❌ |
| MSR-VTT | General | ✔️ | ✔️ | ❌ | ❌ |
| FETV | General | ✔️ | ✔️ | ❌ | ✔️ |
| VBench | General | ✔️ | ✔️ | ❌ | ✔️ |
| T2VScore | General | ✔️ | ✔️ | ❌ | ❌ |
| ChronoMagic-Bench | Time-lapse | ✔️ | ✔️ | ✔️ | ✔️ |
We specifically design **four major categories** for time-lapse videos *(as shown below)*, including *biological*, *human-created*, *meteorological*, and *physical* videos, and extend these to **75 subcategories**. Based on this, we construct **ChronoMagic-Bench**, comprising 1,649 prompts and their corresponding reference time-lapse videos.
| Biological | Human Created | Meteorological | Physical |
|---|---|---|---|
| "Time-lapse of microgreens germinating and growing ..." | "Time-lapse of a modern house being constructed in ..." | "Time-lapse of a beach sunset capturing the sun's ..." | "Time-lapse of an ice cube melting on a solid ..." |
| "Time-lapse of microgreens germinating and growing ..." | "Time-lapse of a 3D printing process: starting with ..." | "Time-lapse of a solar eclipse showing the moon's ..." | "Time-lapse of a cake baking in an oven, depicting ..." |
| "Time-lapse of a butterfly metamorphosis from ..." | "Time-lapse of a busy nighttime city intersection ..." | "Time-lapse of a landscape transitioning from a ..." | "Time-lapse of a strawberry rotting: starting with ..." |
### :mortar_board: Evaluation Results
We visualize the evaluation results of various open-source and closed-source T2V generation models across ChronoMagic-Bench.
#### :trophy: Leaderboard
See numeric values at our [Leaderboard](https://huggingface.co/spaces/BestWishYsh/ChronoMagic-Bench) :1st_place_medal::2nd_place_medal::3rd_place_medal:
or you can run it locally:
```bash
cd LeadBoard
python app.py
```

## ⚙️ Requirements and Installation
We recommend the following setup.
### Environment
```bash
git clone --depth=1 https://github.com/PKU-YuanGroup/ChronoMagic-Bench.git
cd ChronoMagic-Bench
conda create -n chronomagic python=3.10
conda activate chronomagic

# install base packages
pip install -r requirements.txt

# install flash-attn
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/csrc/layer_norm && pip install .
cd ../../../
rm -r flash-attention
```

### Download Checkpoints
```bash
huggingface-cli download --repo-type model \
BestWishYsh/ChronoMagic-Bench \
--local-dir BestWishYsh/ChronoMagic-Bench
```

## :bookmark_tabs: Benchmark Prompts
We provide the *evaluation prompt lists* of *ChronoMagic-Bench* [here](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main/Captions) or [here](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/prompts). You can use them to sample videos from your model for evaluation. We also provide the *reference videos* for the corresponding evaluation prompts [here](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main).
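For illustration, here is a minimal Python sketch of such a sampling loop. It assumes the prompt list has been saved locally as a JSON dict mapping video IDs to prompts; the file name and the `generate_video` function are placeholders to be replaced with the actual prompt files and your model's API.

```python
import json
from pathlib import Path

# Hypothetical local copy of the ChronoMagic-Bench prompt list, assumed
# here to be a JSON dict of {video_id: prompt}; adapt to the real files.
PROMPT_FILE = "chronomagic_prompts.json"
OUTPUT_DIR = Path("input_video_folder/my_model/1")

def generate_video(prompt: str, out_path: Path) -> None:
    """Placeholder for your own text-to-video model's sampling call."""
    raise NotImplementedError

prompts = json.loads(Path(PROMPT_FILE).read_text(encoding="utf-8"))
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

for video_id, prompt in prompts.items():
    # Each video must be saved as "<video_id>.mp4" so that text
    # relevance can be matched back to its prompt.
    generate_video(prompt, OUTPUT_DIR / f"{video_id}.mp4")
```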
## :hammer: Usage
Use *ChronoMagic-Bench* to evaluate both individual videos and video generation models.
### Prepare Videos for Evaluation
The generated videos should be named after the corresponding prompt IDs in ChronoMagic-Bench and placed in the evaluation folder, which is structured as follows. We also provide input examples in the ['toy_video'](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/toy_video) folder.
```
# for open-source models
`-- input_video_folder
`-- model_name_a
|-- 1
| |-- 3d_printing_08.mp4
| `-- ...
|-- 2
| |-- 3d_printing_08.mp4
| `-- ...
`-- 3
|-- 3d_printing_08.mp4
`-- ...
`-- model_name_b
|-- 1
| |-- 3d_printing_08.mp4
| `-- ...
|-- 2
| |-- 3d_printing_08.mp4
| `-- ...
`-- 3
|-- 3d_printing_08.mp4
`-- ...
# for closed-source models
`-- input_video_folder
|-- model_name_a
| |-- 3d_printing_08.mp4
| `-- animal_04.mp4
| `-- ...
|-- model_name_b
| |-- 3d_printing_08.mp4
| `-- ...
`-- ...
```

The filenames of all videos to be evaluated should be "videoid.mp4". For example, if the videoid is 3d_printing_08, the video filename should be "3d_printing_08.mp4". If this naming convention is not followed, the text relevance cannot be evaluated.
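Before running the evaluation, a quick sanity check like the following sketch can catch naming mistakes. The `expected_ids` set below is a placeholder; build it from the prompt list you sampled from.

```python
from pathlib import Path

def check_naming(model_dir: str, expected_ids: set[str]) -> None:
    """Report videos whose filenames don't match an expected prompt ID,
    and prompt IDs with no corresponding video."""
    found = {video.stem for video in Path(model_dir).rglob("*.mp4")}
    for unexpected in sorted(found - expected_ids):
        print(f"unexpected filename: {unexpected}.mp4")
    for missing in sorted(expected_ids - found):
        print(f"missing video: {missing}.mp4")

# Example with a placeholder ID set; build the real set from the prompt files.
check_naming("input_video_folder/model_name_a", {"3d_printing_08", "animal_04"})
```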
### Get MTScore, CHScore and GPT4o-MTScore
We provide output examples in the ['results'](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/results) folder. You can run the following commands for testing, then modify the relevant parameters (such as *model_names*, *input_folder*, *model_pth* and *openai_api*) to suit the text-to-video (T2V) generation model you want to evaluate.
```bash
# pass multiple names to --model_names (e.g. "--model_names name1 name2")
# to evaluate more than one model at once
python evaluate.py \
    --eval_type "open" \
    --model_names test \
    --input_folder toy_video \
    --output_folder results \
    --video_frames_folder video_frames_folder_temp \
    --model_pth_CHScore cotracker2.pth \
    --model_pth_MTScore InternVideo2-stage2_1b-224p-f4.pt \
    --num_workers 8 \
    --openai_api "sk-UybXXX"
```

If you only want to evaluate one of the metrics instead of calculating all of them, follow the steps below. Before running, please modify the parameters in *'xxx.sh'* as needed. (If you want to obtain the [JSON](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/blob/main/LeadBoard/file/ChronoMagic-Bench-Input.json) to submit to the [leaderboard](https://huggingface.co/spaces/BestWishYsh/ChronoMagic-Bench), organize the output files of *MTScore / CHScore / GPT4o-MTScore* according to ['results'](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/results) and then proceed with the following steps.)
```bash
# for MTScore
cd MTScore
bash get_mtscore.sh

# for CHScore
cd CHScore
bash get_chscore.sh

# for GPT4o-MTScore
cd GPT4o_MTScore
bash get_gp4omtscore.sh
```

### Get UMT-FVD and UMTScore
Please refer to the folder [UMT](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/UMT) for how to compute the UMTScore.
### Get File and Submit to Leaderboard
```bash
python get_uploaded_json.py \
--input_path results/all \
--output_path results
```

After completing the above steps, you will obtain [ChronoMagic-Bench-Input.json](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/blob/main/LeadBoard/file/ChronoMagic-Bench-Input.json). You then need to manually fill the [JSON](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/blob/main/LeadBoard/file/ChronoMagic-Bench-Input.json) with the UMT-FVD and UMTScore values, as we calculate them separately. Finally, you can submit the [JSON](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/blob/main/LeadBoard/file/ChronoMagic-Bench-Input.json) to [HuggingFace](https://huggingface.co/spaces/BestWishYsh/ChronoMagic-Bench).
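A minimal sketch of that manual step, assuming the scores are stored as per-model fields named `UMT-FVD` and `UMTScore`; the entry and key names below are illustrative and should be matched to the JSON that `get_uploaded_json.py` actually produces:

```python
import json
from pathlib import Path

json_path = Path("results/ChronoMagic-Bench-Input.json")
data = json.loads(json_path.read_text(encoding="utf-8"))

# "my_model", "UMT-FVD" and "UMTScore" are illustrative names; match them
# to the model entry and field names in your generated JSON.
data["my_model"]["UMT-FVD"] = 123.4   # value computed via the UMT folder
data["my_model"]["UMTScore"] = 2.56

json_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```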
## :surfer: Sampled Videos
[Sampled Videos](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main/Results)
To facilitate future research and to ensure full transparency, we release all the videos we sampled and used for *ChronoMagic-Bench* evaluation. You can download them on [Hugging Face](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Bench/tree/main/Results). We also provide detailed explanations of the sampled videos and detailed settings for the models under evaluation [here](https://arxiv.org/abs/2406.18522).
## 🐳 ChronoMagic-Pro Dataset
*ChronoMagic-Pro* contains **460K** time-lapse videos, each accompanied by a detailed caption. We also release *ChronoMagic-ProH*, a higher-quality **150K** subset. Both datasets can be downloaded [here](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-Pro) and [here](https://huggingface.co/datasets/BestWishYsh/ChronoMagic-ProH), or with the following command. Some samples can be found on our [Project Page](https://pku-yuangroup.github.io/ChronoMagic-Bench/).

```bash
# for the high-quality subset, replace ChronoMagic-Pro with ChronoMagic-ProH
huggingface-cli download --repo-type dataset \
    --resume-download BestWishYsh/ChronoMagic-Pro \
    --local-dir BestWishYsh/ChronoMagic-Pro \
    --local-dir-use-symlinks False
```
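Equivalently, the download can be scripted from Python with `huggingface_hub` (a sketch, assuming `pip install huggingface_hub`):

```python
from huggingface_hub import snapshot_download

# Interrupted downloads are resumed automatically by recent
# versions of huggingface_hub.
snapshot_download(
    repo_id="BestWishYsh/ChronoMagic-Pro",  # or "BestWishYsh/ChronoMagic-ProH"
    repo_type="dataset",
    local_dir="BestWishYsh/ChronoMagic-Pro",
)
```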
Please refer to the folder [Multi-Aspect_Preprocessing](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/tree/main/Multi-Aspect_Preprocessing) for how the *ChronoMagic-Pro* data is processed.
## 👍 Acknowledgement
* This project wouldn't be possible without the following open-source repositories:
[CoTracker](https://github.com/facebookresearch/co-tracker), [InternVideo2](https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo2), [UMT](https://github.com/OpenGVLab/unmasked_teacher), [FETV](https://github.com/llyx97/FETV-EVAL), [VBench](https://github.com/Vchitect/VBench), [Panda-70M](https://github.com/snap-research/Panda-70M), [ShareGPT4Video](https://sharegpt4video.github.io/) and [LAION Aesthetic Predictor](https://github.com/LAION-AI/aesthetic-predictor).

## 🔒 License
* The majority of this project is released under the Apache 2.0 license as found in the [LICENSE](https://github.com/PKU-YuanGroup/ChronoMagic-Bench/blob/main/LICENSE) file.
* The service is a research preview. Please contact us if you find any potential violations. (shyuan-cs@hotmail.com)

## ✏️ Citation
If you find our paper and code useful in your research, please consider giving a star :star: and citation :pencil:.
```BibTeX
@article{yuan2024chronomagic,
title={ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation},
author={Yuan, Shenghai and Huang, Jinfa and Xu, Yongqi and Liu, Yaoyang and Zhang, Shaofeng and Shi, Yujun and Zhu, Ruijie and Cheng, Xinhua and Luo, Jiebo and Yuan, Li},
journal={arXiv preprint arXiv:2406.18522},
year={2024}
}
```

## 🤝 Contributors