[ICLR 2024] Code for FreeNoise based on VideoCrafter

https://github.com/ailab-cvc/freenoise

## ___***FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling***___

### 🔥🔥🔥 LongerCrafter for longer high-quality video generation is now released!



✅ Totally tuning-free &nbsp;&nbsp; ✅ Less than 20% extra inference time &nbsp;&nbsp; ✅ Supports up to 512 frames

[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/MoonQiu/FreeNoise)
[![Replicate](https://replicate.com/cjwbw/longercrafter/badge)](https://replicate.com/cjwbw/longercrafter)

_**[Haonan Qiu](http://haonanqiu.com/), [Menghan Xia*](https://menghanxia.github.io), [Yong Zhang](https://yzhang2016.github.io), [Yingqing He](https://github.com/YingqingHe), [Xintao Wang](https://xinntao.github.io), [Ying Shan](https://scholar.google.com/citations?hl=zh-CN&user=4oXBp9UAAAAJ), and [Ziwei Liu*](https://liuziwei7.github.io/)**_

(*corresponding authors)

From Tencent AI Lab and Nanyang Technological University.


Input: "A chihuahua in astronaut suit floating in space, cinematic lighting, glow effect";


Resolution: 1024 x 576; Frames: 64.



Input: "Campfire at night in a snowy forest with starry sky in the background";


Resolution: 1024 x 576; Frames: 64.




## 🔆 Introduction

🤗🤗🤗 LongerCrafter (FreeNoise) is a tuning-free and time-efficient paradigm for longer video generation based on pretrained video diffusion models.
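The core idea behind the name "noise rescheduling" can be illustrated with a toy sketch: instead of sampling fresh noise for every frame of a long video, the noise for one window is repeated along the time axis and locally shuffled, so distant frames stay correlated while exact repetition is broken. This is a simplified illustration of the idea, not the repository's implementation; the function name, window size, and NumPy latents are illustrative assumptions.

```python
import numpy as np

def reschedule_noise(base_noise, total_frames, window=16, seed=0):
    """Extend per-frame latent noise (shape [F, ...]) to `total_frames`
    by tiling the base frames, then shuffling frames inside each local
    window so long-range correlation is kept without exact repeats."""
    rng = np.random.default_rng(seed)
    f = base_noise.shape[0]
    reps = -(-total_frames // f)  # ceil division
    # Tile the base noise along the frame axis, then trim to length.
    long_noise = np.concatenate([base_noise] * reps, axis=0)[:total_frames]
    # Permute frames within each window; the RHS fancy index is copied
    # before assignment, so this is an in-window shuffle, not a swap bug.
    for start in range(0, total_frames, window):
        idx = np.arange(start, min(start + window, total_frames))
        long_noise[idx] = long_noise[rng.permutation(idx)]
    return long_noise
```

Because only existing noise frames are reordered, the marginal noise distribution the diffusion model was trained on is preserved.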

### 1. Longer Single-Prompt Text-to-Video Generation



Longer single-prompt results. Resolution: 256 x 256; Frames: 512. (Compressed)


### 2. Longer Multi-Prompt Text-to-Video Generation



Longer multi-prompt results. Resolution: 256 x 256; Frames: 256. (Compressed)


## 📝 Changelog
- __[2024.01.28]__: 🔥🔥 Support FreeNoise on VideoCrafter2!
- __[2024.01.23]__: 🔥🔥 Support FreeNoise on two other video frameworks, AnimateDiff and LaVie!
- __[2023.10.25]__: 🔥🔥 Release the 256x256 model and support multi-prompt generation!
- __[2023.10.24]__: 🔥🔥 Release LongerCrafter (FreeNoise) for longer video generation!

## 🧰 Models

|Model|Resolution|Checkpoint|Description|
|:---------|:---------|:--------|:--------|
|VideoCrafter (Text2Video)|576x1024|[Hugging Face](https://huggingface.co/VideoCrafter/Text2Video-1024-v1.0/blob/main/model.ckpt)|Supports 64 frames on an NVIDIA A100 (40GB)|
|VideoCrafter (Text2Video)|256x256|[Hugging Face](https://huggingface.co/VideoCrafter)|Supports 512 frames on an NVIDIA A100 (40GB)|
|VideoCrafter2 (Text2Video)|320x512|[Hugging Face](https://huggingface.co/VideoCrafter/VideoCrafter2/blob/main/model.ckpt)|Supports 128 frames on an NVIDIA A100 (40GB)|

(Reduce the number of frames when running on smaller GPUs, e.g. 256x256 resolution with 64 frames.)
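For scripting convenience, the frame limits from the table above can be encoded as a small lookup; the dictionary and helper below are a hypothetical convenience, not part of the repository, and the values apply only to the A100 40GB measurements reported above.

```python
# Frame limits from the table above, measured on an NVIDIA A100 (40GB).
MAX_FRAMES_A100_40GB = {
    (576, 1024): 64,   # VideoCrafter (Text2Video)
    (256, 256): 512,   # VideoCrafter (Text2Video)
    (320, 512): 128,   # VideoCrafter2 (Text2Video)
}

def max_frames(height: int, width: int) -> int:
    """Look up the reported frame limit for a supported resolution."""
    return MAX_FRAMES_A100_40GB[(height, width)]
```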

## ⚙️ Setup

### Install Environment via Anaconda (Recommended)
```bash
conda create -n freenoise python=3.8.5
conda activate freenoise
pip install -r requirements.txt
```
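After installation, you can sanity-check that the environment resolves the key packages before launching a long run. The helper below is a generic sketch; the package names passed to it are assumptions based on a typical VideoCrafter-style environment, not a list read from `requirements.txt`.

```python
import importlib.util

def missing_packages(names):
    """Return the packages from `names` that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    # Assumed core dependencies; adjust to match requirements.txt.
    missing = missing_packages(["torch", "torchvision", "omegaconf", "pytorch_lightning"])
    print("missing:", missing or "none")
```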

## 💫 Inference
### 1. Longer Text-to-Video

1) Download the pretrained T2V model from [Hugging Face](https://huggingface.co/VideoCrafter/Text2Video-1024-v1.0/blob/main/model.ckpt) and place `model.ckpt` at `checkpoints/base_1024_v1/model.ckpt`.
2) Run the following command in a terminal.
```bash
sh scripts/run_text2video_freenoise_1024.sh
```
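Step 1 can also be done programmatically with `hf_hub_download` from the official `huggingface_hub` library. The helper below is a sketch under the assumption that the run script expects the layout described above; the function names are illustrative.

```python
from pathlib import Path

def checkpoint_target(model_dir: str) -> Path:
    """Path where the run scripts expect the weights to live."""
    return Path("checkpoints") / model_dir / "model.ckpt"

def fetch_checkpoint(repo_id: str = "VideoCrafter/Text2Video-1024-v1.0",
                     model_dir: str = "base_1024_v1") -> str:
    """Download model.ckpt from the Hub into checkpoints/<model_dir>/."""
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    target = checkpoint_target(model_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    return hf_hub_download(repo_id=repo_id, filename="model.ckpt",
                           local_dir=str(target.parent))
```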

### 2. Longer Multi-Prompt Text-to-Video

1) Download the pretrained T2V model from [Hugging Face](https://huggingface.co/VideoCrafter) and place `model.ckpt` at `checkpoints/base_256_v1/model.ckpt`.
2) Run the following command in a terminal.
```bash
sh scripts/run_text2video_freenoise_mp_256.sh
```

## 🧲 Support for Other Models

FreeNoise is expected to work with other, similar frameworks. An easy compatibility test is to shuffle the initial noise along the frame axis (with eta set to 0) and check whether a new but similar video is generated. If you have any questions about applying FreeNoise to other frameworks, feel free to contact [Haonan Qiu](http://haonanqiu.com/).
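The shuffle test described above amounts to a one-line permutation of the initial latents; the sketch below shows the idea in NumPy under the assumption that frames are the first axis of the noise tensor.

```python
import numpy as np

def shuffle_frames(noise: np.ndarray, seed: int = 0) -> np.ndarray:
    """Permute the initial latent noise along the frame (first) axis.

    With deterministic sampling (eta = 0), feeding the permuted noise
    through the model should still yield a plausible, similar video if
    the framework treats per-frame noise the way FreeNoise requires."""
    rng = np.random.default_rng(seed)
    return noise[rng.permutation(noise.shape[0])]
```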

Current official implementation: [FreeNoise-VideoCrafter](https://github.com/AILab-CVC/FreeNoise), [FreeNoise-AnimateDiff](https://github.com/arthur-qiu/FreeNoise-AnimateDiff), [FreeNoise-LaVie](https://github.com/arthur-qiu/FreeNoise-LaVie)

## 👨‍👩‍👧‍👦 Crafter Family
[VideoCrafter](https://github.com/AILab-CVC/VideoCrafter): Framework for high-quality video generation.

[ScaleCrafter](https://github.com/YingqingHe/ScaleCrafter): Tuning-free method for high-resolution image/video generation.

[TaleCrafter](https://github.com/AILab-CVC/TaleCrafter): An interactive story visualization tool that supports multiple characters.

## 😉 Citation
```bib
@misc{qiu2023freenoise,
title={FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling},
author={Haonan Qiu and Menghan Xia and Yong Zhang and Yingqing He and Xintao Wang and Ying Shan and Ziwei Liu},
year={2023},
eprint={2310.15169},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

## 📢 Disclaimer
We developed this repository for RESEARCH purposes, so it may be used only for personal, research, or other non-commercial purposes.