https://github.com/iceclear/seedvr

[CVPR2025 Highlight] SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration

Host: GitHub
URL: https://github.com/iceclear/seedvr
Owner: IceClear
License: apache-2.0
Created: 2025-03-07T18:12:15.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-06-11T06:55:38.000Z (about 1 year ago)
Last Synced: 2025-06-11T07:47:48.483Z (about 1 year ago)
Homepage:
Size: 307 KB
Stars: 86
Watchers: 18
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

README

SeedVR:

Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration

Jianyi Wang¹,²
Zhijie Lin²
Meng Wei²
Yang Zhao²
Ceyuan Yang²
Fei Xiao²
Chen Change Loy¹
Lu Jiang²

¹S-Lab, Nanyang Technological University
²ByteDance

CVPR 2025 (Highlight)

SeedVR is a large diffusion-transformer model that restores videos at any resolution
without relying on any additional diffusion prior.

---

## 🔥 Update
- [2025.06] 🔥🔥🔥 [Inference code](https://github.com/ByteDance-Seed/SeedVR/tree/main) and [model weights](https://huggingface.co/models?other=seedvr) released!
- [2025.03] Repo created. The open-source process depends on company policy, and we will keep posting updates on this page.

---

> **Why SeedVR:** *Conventional restoration models perform poorly on both real-world and AIGC video restoration because of their limited generation ability. Recent diffusion-based models improve performance by introducing a diffusion prior via ControlNet-like or adaptor-like architectures. Despite these gains, such methods generally suffer from constraints imposed by the diffusion prior: they inherit the prior's biases, e.g., limited generation ability on small text and faces, and only work at fixed resolutions such as 512 or 1024. As a result, most existing diffusion-based restoration models rely on patch-based sampling, i.e., dividing the input video into overlapping spatial-temporal patches and fusing these patches using a Gaussian kernel at each diffusion step. The large overlap (e.g., 50% of the patch size) required to ensure a coherent output without visible patch boundaries often leads to considerably slower inference. This inefficiency becomes even more pronounced when processing long videos at high resolutions. SeedVR follows SOTA video generation training pipelines to tackle the key challenge in diffusion-based restoration, i.e., enabling arbitrary-resolution restoration without relying on any pretrained diffusion prior, and introduces advanced video generation technologies suitable for video restoration. We hope that SeedVR, the largest-ever diffusion transformer model for generic video restoration, can push the frontiers of advanced VR and inspire future research in developing large vision models for real-world video restoration.*
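
The patch-based sampling described above, which SeedVR aims to avoid, can be illustrated with a minimal 1D sketch. This is not SeedVR code and not the code of any particular baseline: the helper names (`patch_starts`, `gaussian_weights`, `fuse_patches`), the `sigma_frac` parameter, and the identity `process` function standing in for a diffusion step are all assumptions made for illustration. Real methods apply the same idea over spatial-temporal patches.

```python
import math

def patch_starts(length, patch, stride):
    """Start indices of overlapping windows that together cover [0, length)."""
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        starts.append(length - patch)  # last window sits flush with the end
    return starts

def gaussian_weights(patch, sigma_frac=0.25):
    """1D Gaussian window peaking at the patch centre (sigma_frac is illustrative)."""
    centre = (patch - 1) / 2
    sigma = sigma_frac * patch
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(patch)]

def fuse_patches(signal, patch, stride, process):
    """Apply `process` to each window and blend the outputs with Gaussian weights."""
    n = len(signal)
    patch = min(patch, n)
    weights = gaussian_weights(patch)
    acc = [0.0] * n   # weighted sum of patch outputs per position
    norm = [0.0] * n  # sum of weights per position
    for s in patch_starts(n, patch, stride):
        out = process(signal[s:s + patch])
        for i, (w, v) in enumerate(zip(weights, out)):
            acc[s + i] += w * v
            norm[s + i] += w
    return [a / m for a, m in zip(acc, norm)]

signal = [float(i) for i in range(16)]
identity = lambda window: list(window)  # stand-in for one restoration step

fused = fuse_patches(signal, patch=8, stride=4, process=identity)
n_overlap = len(patch_starts(16, 8, 4))  # 50% overlap
n_tiled = len(patch_starts(16, 8, 8))    # no overlap
print(n_overlap, n_tiled)
```

In this toy case, 50% overlap needs 3 windows where non-overlapping tiling needs 2. For inputs much larger than the patch, the count roughly doubles along each axis, so overlapping along height, width, and time multiplies the number of patch evaluations by roughly eight at every diffusion step.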

## 📑 Citation

If you find our repo useful for your research, please consider citing our paper:

```bibtex
@inproceedings{wang2025seedvr,
title={SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration},
author={Wang, Jianyi and Lin, Zhijie and Wei, Meng and Zhao, Yang and Yang, Ceyuan and Xiao, Fei and Loy, Chen Change and Jiang, Lu},
booktitle={CVPR},
year={2025}
}
```

## 📝 License

This project is licensed under [Apache 2.0](https://github.com/ByteDance-Seed/SeedVR?tab=Apache-2.0-1-ov-file#readme). Redistribution and use should follow this license.

## 📧 Contact
If you have any questions, please feel free to reach us at `iceclearwjy@gmail.com`.
