https://github.com/SeonmiP/KineTy
Official Code for "Kinetic Typography Diffusion Model (ECCV 2024)"
- Host: GitHub
- URL: https://github.com/SeonmiP/KineTy
- Owner: SeonmiP
- License: other
- Created: 2024-07-11T09:14:44.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-01-11T21:53:14.000Z (5 months ago)
- Last Synced: 2025-01-11T22:32:23.356Z (5 months ago)
- Topics: eccv2024, kinetic-typography, video-generation
- Language: Python
- Homepage: https://seonmip.github.io/kinety/
- Size: 16.4 MB
- Stars: 42
- Watchers: 8
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-diffusion-categorized
README
Kinetic Typography Diffusion Model
Seonmi Park · Inhwan Bae · Seunghyun Shin · Hae-Gon Jeon
ECCV 2024
Project Page · ECCV Paper · arXiv · Dataset
Example of our generated videos.

## Source Code
We provide the source code of our KineTy model. Details are as follows.
### 🏢 Installation
#### Setup conda environment
```
git clone https://github.com/SeonmiP/KineTy.git
cd KineTy
conda env create -f environment.yaml
conda activate kinety
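# Optional sanity check (our suggestion; assumes PyTorch is listed in environment.yaml):
python -c "import torch; print(torch.__version__)"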
```

#### Setup conda environment Download Stable Diffusion V1.5
```
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/
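# If git lfs is unavailable, the same weights can likely be fetched with the
# Hugging Face CLI instead (our suggestion, not part of the original instructions):
# huggingface-cli download runwayml/stable-diffusion-v1-5 --local-dir models/StableDiffusion/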
```

### 💾 Dataset
We describe how to construct the dataset [here](https://github.com/SeonmiP/KineTy/blob/main/dataset_construction).

### ⚽ Training
We trained our model on a machine with 8 NVIDIA A100 GPUs.

```
torchrun --nnodes=1 --nproc_per_node=1 train.py --config configs/train.yaml
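# To use all 8 GPUs of a node, as in the training setup above (assuming train.py
# follows standard torchrun distributed-launch conventions):
# torchrun --nnodes=1 --nproc_per_node=8 train.py --config configs/train.yaml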
```

### 🎨 Inference
Our code runs on an NVIDIA A100 GPU, and we also verified that it runs on an NVIDIA GeForce RTX 3090 Ti.

```
python -m inference --config configs/inference.yaml
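# To pin inference to a specific GPU (a standard CUDA environment variable,
# not KineTy-specific):
# CUDA_VISIBLE_DEVICES=0 python -m inference --config configs/inference.yaml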
```

## Acknowledgements
Part of our code is built upon [AnimateDiff](https://github.com/guoyww/AnimateDiff/tree/main) and [Tune-a-Video](https://github.com/showlab/Tune-A-Video). The visualization of the attention map refers to [FateZero](https://github.com/ChenyangQiQi/FateZero/tree/main) and [prompt-to-prompt](https://github.com/google/prompt-to-prompt/). Thanks to the authors for sharing their work.