https://github.com/oelin/sora-xl
A prospective method for eXtremely Long-form video generation using a combination of Sora and NUWA-XL.
- Host: GitHub
- URL: https://github.com/oelin/sora-xl
- Owner: oelin
- License: mit
- Created: 2024-02-16T22:49:32.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-22T15:42:59.000Z (over 1 year ago)
- Last Synced: 2024-02-22T16:48:36.850Z (over 1 year ago)
- Homepage:
- Size: 11.7 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Sora-XL
A prospective method for eXtremely Long-form video generation using a combination of Sora and NUWA-XL.
## 1. Introduction
[Sora](https://openai.com/sora) is a SOTA video synthesis diffusion model, capable of generating samples of unprecedented realism. However, its samples are limited to roughly one minute which, while impressive, is not sufficient for eXtremely Long-form (XL) video generation. [NUWA-XL](https://arxiv.org/abs/2303.12346) is a framework for producing XL video content using a hierarchical cascade of video diffusion models. NUWA-XL starts by sampling a set of coarse keyframes representing the overall narrative of the video. Local diffusion models are then used to sample increasingly fine keyframes between them in a recursive fashion. This hierarchical approach leads to coherent dynamics over long durations (e.g. 10+ minutes), but its samples are of far lower quality than Sora's. The idea of Sora-XL is to combine both approaches, using Sora as a local diffusion model to interpolate between keyframes. This could allow for high-quality XL video generation.
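The coarse-to-fine loop described above can be sketched as follows. This is a minimal illustration, not an implementation: `sample_keyframes` and `interpolate` are hypothetical stand-ins for a global keyframe sampler (as in NUWA-XL) and a local diffusion model conditioned on two boundary frames (the role proposed for Sora); both return labeled stubs rather than real frames.

```python
def sample_keyframes(prompt, n):
    # Placeholder for a global model: would return n coarse keyframes
    # spanning the whole narrative of the video.
    return [f"key({prompt},{i})" for i in range(n)]

def interpolate(left, right, n):
    # Placeholder for a local model: would sample n in-between frames
    # conditioned on the two boundary keyframes.
    return [f"mid({left}|{right},{i})" for i in range(n)]

def generate(prompt, n_keyframes=5, depth=2, frames_per_gap=3):
    """Hierarchical coarse-to-fine sampling: draw coarse keyframes,
    then recursively fill each gap with the local model `depth` times."""
    frames = sample_keyframes(prompt, n_keyframes)
    for _ in range(depth):
        refined = [frames[0]]
        for left, right in zip(frames, frames[1:]):
            refined.extend(interpolate(left, right, frames_per_gap))
            refined.append(right)
        frames = refined
    return frames

video = generate("a sailboat at sunset")
```

Each recursion level multiplies the effective frame count (5 keyframes become 17, then 65 frames here), which is what lets the hierarchy reach XL durations while every individual local-model call only spans one short gap.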