[ICLR 2024] Code for FreeNoise based on VideoCrafter

https://github.com/ailab-cvc/freenoise

## ___***FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling***___

### 🔥🔥🔥 LongerCrafter for longer high-quality video generation is now released!



✅ Totally tuning-free &nbsp;&nbsp; ✅ Less than 20% extra inference time &nbsp;&nbsp; ✅ Supports up to 512 frames

[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/MoonQiu/FreeNoise)
[![Replicate](https://replicate.com/cjwbw/longercrafter/badge)](https://replicate.com/cjwbw/longercrafter)

_**[Haonan Qiu](http://haonanqiu.com/), [Menghan Xia*](https://menghanxia.github.io), [Yong Zhang](https://yzhang2016.github.io), [Yingqing He](https://github.com/YingqingHe), [Xintao Wang](https://xinntao.github.io), [Ying Shan](https://scholar.google.com/citations?hl=zh-CN&user=4oXBp9UAAAAJ), and [Ziwei Liu*](https://liuziwei7.github.io/)**_

(*corresponding authors)

From Tencent AI Lab and Nanyang Technological University.


Input: "A chihuahua in astronaut suit floating in space, cinematic lighting, glow effect";


Resolution: 1024 x 576; Frames: 64.



Input: "Campfire at night in a snowy forest with starry sky in the background";


Resolution: 1024 x 576; Frames: 64.




## 🔆 Introduction

🤗🤗🤗 LongerCrafter (FreeNoise) is a tuning-free and time-efficient paradigm for longer video generation based on pretrained video diffusion models.
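The core idea behind the name "noise rescheduling" can be illustrated with a toy sketch: instead of sampling fresh noise for every frame of a long video, the noise for one window is repeated along the time axis and locally shuffled, so distant frames stay correlated while exact repetition is broken. This is a simplified illustration of the idea, not the repository's implementation; the function name, window size, and NumPy latents are illustrative assumptions.

```python
import numpy as np

def reschedule_noise(base_noise, total_frames, window=16, seed=0):
    """Extend per-frame latent noise (shape [F, ...]) to `total_frames`
    by tiling the base frames, then shuffling frames inside each local
    window so long-range correlation is kept without exact repeats."""
    rng = np.random.default_rng(seed)
    f = base_noise.shape[0]
    reps = -(-total_frames // f)  # ceil division
    # Tile the base noise along the frame axis, then trim to length.
    long_noise = np.concatenate([base_noise] * reps, axis=0)[:total_frames]
    # Permute frames within each window; the RHS fancy index is copied
    # before assignment, so this is an in-window shuffle, not a swap bug.
    for start in range(0, total_frames, window):
        idx = np.arange(start, min(start + window, total_frames))
        long_noise[idx] = long_noise[rng.permutation(idx)]
    return long_noise
```

Because only existing noise frames are reordered, the marginal noise distribution the diffusion model was trained on is preserved.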

### 1. Longer Single-Prompt Text-to-Video Generation



Longer single-prompt results. Resolution: 256 x 256; Frames: 512. (Compressed)


### 2. Longer Multi-Prompt Text-to-Video Generation



Longer multi-prompt results. Resolution: 256 x 256; Frames: 256. (Compressed)


## 📝 Changelog
- __[2024.01.28]__: 🔥🔥 Support FreeNoise on VideoCrafter2!
- __[2024.01.23]__: 🔥🔥 Support FreeNoise on two other video frameworks, AnimateDiff and LaVie!
- __[2023.10.25]__: 🔥🔥 Release the 256x256 model and support multi-prompt generation!
- __[2023.10.24]__: 🔥🔥 Release LongerCrafter (FreeNoise) for longer video generation!

## 🧰 Models

|Model|Resolution|Checkpoint|Description|
|:---------|:---------|:--------|:--------|
|VideoCrafter (Text2Video)|576x1024|[Hugging Face](https://huggingface.co/VideoCrafter/Text2Video-1024-v1.0/blob/main/model.ckpt)|Supports 64 frames on an NVIDIA A100 (40GB)|
|VideoCrafter (Text2Video)|256x256|[Hugging Face](https://huggingface.co/VideoCrafter)|Supports 512 frames on an NVIDIA A100 (40GB)|
|VideoCrafter2 (Text2Video)|320x512|[Hugging Face](https://huggingface.co/VideoCrafter/VideoCrafter2/blob/main/model.ckpt)|Supports 128 frames on an NVIDIA A100 (40GB)|

(Reduce the number of frames when running on smaller GPUs, e.g. 256x256 resolution with 64 frames.)
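For scripting convenience, the frame limits from the table above can be encoded as a small lookup; the dictionary and helper below are a hypothetical convenience, not part of the repository, and the values apply only to the A100 40GB measurements reported above.

```python
# Frame limits from the table above, measured on an NVIDIA A100 (40GB).
MAX_FRAMES_A100_40GB = {
    (576, 1024): 64,   # VideoCrafter (Text2Video)
    (256, 256): 512,   # VideoCrafter (Text2Video)
    (320, 512): 128,   # VideoCrafter2 (Text2Video)
}

def max_frames(height: int, width: int) -> int:
    """Look up the reported frame limit for a supported resolution."""
    return MAX_FRAMES_A100_40GB[(height, width)]
```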

## ⚙️ Setup

### Install Environment via Anaconda (Recommended)
```bash
conda create -n freenoise python=3.8.5
conda activate freenoise
pip install -r requirements.txt
```
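After installation, you can sanity-check that the environment resolves the key packages before launching a long run. The helper below is a generic sketch; the package names passed to it are assumptions based on a typical VideoCrafter-style environment, not a list read from `requirements.txt`.

```python
import importlib.util

def missing_packages(names):
    """Return the packages from `names` that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    # Assumed core dependencies; adjust to match requirements.txt.
    missing = missing_packages(["torch", "torchvision", "omegaconf", "pytorch_lightning"])
    print("missing:", missing or "none")
```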

## 💫 Inference
### 1. Longer Text-to-Video

1) Download the pretrained T2V model from [Hugging Face](https://huggingface.co/VideoCrafter/Text2Video-1024-v1.0/blob/main/model.ckpt) and place `model.ckpt` at `checkpoints/base_1024_v1/model.ckpt`.
2) Run the following command in a terminal.
```bash
sh scripts/run_text2video_freenoise_1024.sh
```
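Step 1 can also be done programmatically with `hf_hub_download` from the official `huggingface_hub` library. The helper below is a sketch under the assumption that the run script expects the layout described above; the function names are illustrative.

```python
from pathlib import Path

def checkpoint_target(model_dir: str) -> Path:
    """Path where the run scripts expect the weights to live."""
    return Path("checkpoints") / model_dir / "model.ckpt"

def fetch_checkpoint(repo_id: str = "VideoCrafter/Text2Video-1024-v1.0",
                     model_dir: str = "base_1024_v1") -> str:
    """Download model.ckpt from the Hub into checkpoints/<model_dir>/."""
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    target = checkpoint_target(model_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    return hf_hub_download(repo_id=repo_id, filename="model.ckpt",
                           local_dir=str(target.parent))
```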

### 2. Longer Multi-Prompt Text-to-Video

1) Download the pretrained T2V model from [Hugging Face](https://huggingface.co/VideoCrafter) and place `model.ckpt` at `checkpoints/base_256_v1/model.ckpt`.
2) Run the following command in a terminal.
```bash
sh scripts/run_text2video_freenoise_mp_256.sh
```

## 🧲 Support for Other Models

FreeNoise is expected to work with other, similar frameworks. An easy compatibility test is to shuffle the initial noise along the frame axis (with eta set to 0) and check whether a new but similar video is generated. If you have any questions about applying FreeNoise to other frameworks, feel free to contact [Haonan Qiu](http://haonanqiu.com/).
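The shuffle test described above amounts to a one-line permutation of the initial latents; the sketch below shows the idea in NumPy under the assumption that frames are the first axis of the noise tensor.

```python
import numpy as np

def shuffle_frames(noise: np.ndarray, seed: int = 0) -> np.ndarray:
    """Permute the initial latent noise along the frame (first) axis.

    With deterministic sampling (eta = 0), feeding the permuted noise
    through the model should still yield a plausible, similar video if
    the framework treats per-frame noise the way FreeNoise requires."""
    rng = np.random.default_rng(seed)
    return noise[rng.permutation(noise.shape[0])]
```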

Current official implementation: [FreeNoise-VideoCrafter](https://github.com/AILab-CVC/FreeNoise), [FreeNoise-AnimateDiff](https://github.com/arthur-qiu/FreeNoise-AnimateDiff), [FreeNoise-LaVie](https://github.com/arthur-qiu/FreeNoise-LaVie)

## 👨‍👩‍👧‍👦 Crafter Family
[VideoCrafter](https://github.com/AILab-CVC/VideoCrafter): Framework for high-quality video generation.

[ScaleCrafter](https://github.com/YingqingHe/ScaleCrafter): Tuning-free method for high-resolution image/video generation.

[TaleCrafter](https://github.com/AILab-CVC/TaleCrafter): An interactive story visualization tool that supports multiple characters.

## 😉 Citation
```bib
@misc{qiu2023freenoise,
title={FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling},
author={Haonan Qiu and Menghan Xia and Yong Zhang and Yingqing He and Xintao Wang and Ying Shan and Ziwei Liu},
year={2023},
eprint={2310.15169},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

## 📢 Disclaimer
We developed this repository for RESEARCH purposes, so it may be used only for personal, research, or other non-commercial purposes.