# Generative Video Bi-flow
> by [Chen Liu](https://ryushinn.github.io/) and [Tobias Ritschel](https://www.homepages.ucl.ac.uk/~ucactri/)
>
> _International Conference on Computer Vision (ICCV 2025)_
>
> Please also check out our ([Paper](https://arxiv.org/abs/2503.06364) | [Project Page](https://ryushinn.github.io/ode-video))
This repo provides the official implementation of our paper in PyTorch.
We also provide a compact [pseudocode](https://ryushinn.github.io/ode-video#method) that shows the core logic of our `bi-flow` algorithm without the surrounding engineering code.
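For orientation only, the sketch below shows a generic flow-matching-style next-frame training step. It is an illustrative assumption, not the paper's exact `bi-flow` objective; `velocity_net` and the straight-line interpolation path are placeholders, and the official pseudocode and code are the reference.

```python
import torch

def next_frame_flow_matching_step(velocity_net, optimizer, frame_t, frame_t1):
    """Illustrative flow-matching step between consecutive frames.

    NOTE: hypothetical sketch, not the paper's bi-flow loss.
    `velocity_net(x, tau)` is assumed to predict a velocity field that
    transports frame_t toward frame_t1 along a straight-line path.
    """
    b = frame_t.shape[0]
    tau = torch.rand(b, device=frame_t.device).view(b, 1, 1, 1)   # interpolation time in [0, 1]
    x_tau = (1.0 - tau) * frame_t + tau * frame_t1                # point on the straight path
    target_velocity = frame_t1 - frame_t                          # constant velocity of that path
    pred_velocity = velocity_net(x_tau, tau.flatten())
    loss = torch.mean((pred_velocity - target_velocity) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```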
## Setup
### Install
```bash
# 1. Clone the repo
git clone https://github.com/ryushinn/ode-video.git
cd ode-video
# 2. We recommend installing in a new virtual env with Python 3.10, e.g., via conda:
conda create -n ode-video python=3.10
conda activate ode-video
# 3. Install the dependencies
pip install -r requirements.txt
```
Our test environment is Ubuntu 22.04.4 x64 with an NVIDIA RTX 4090 GPU and CUDA 12.
### Data
Our dataloader expects the following folder structure:
```bash
data
└── {Dataset}
    ├── {train_split}
    │   └── ...  # Nested folders are allowed
    │       └── {clip_folder}
    │           ├── 000001.jpg  # first frame
    │           ├── 000002.jpg  # second frame
    │           └── ...
    └── {test_split}
        └── ...
            └── {clip_folder}
                ├── 000001.jpg
                ├── 000002.jpg
                └── ...
```
Every (sub)folder in the train or test split should contain only consecutive frames from the same video clip, named in sorted order.
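As a quick way to verify this layout, a hypothetical checker (not part of the repo) could walk a split folder and report the frame count per clip:

```python
# Hypothetical sanity check: list clip folders and frame counts under data/{Dataset}/{split}.
import os
import sys

def check_split(split_dir):
    for root, _, files in os.walk(split_dir):
        frames = sorted(f for f in files if f.endswith((".jpg", ".png")))
        if frames:  # this folder holds the frames of one clip
            print(f"{root}: {len(frames)} frames ({frames[0]} .. {frames[-1]})")

if __name__ == "__main__":
    check_split(sys.argv[1])  # e.g. python check_split.py data/sky_timelapse/sky_train
```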
For example, you can set up the `sky` dataset in this format using:
```bash
# If daily download limit was reached, please download manually at
# https://drive.google.com/uc?id=1xWLiU-MBGN7MrsFHQm4_yXmfHBsMbJQo
gdown 1xWLiU-MBGN7MrsFHQm4_yXmfHBsMbJQo -O sky_timelapse.zip
unzip sky_timelapse.zip -d data
rm sky_timelapse.zip
```
For datasets distributed in a format other than image frames, you can use `scripts/pt_to_frames.py` (e.g., [`CARLA`](https://github.com/plai-group/flexible-video-diffusion-modeling?tab=readme-ov-file#preparing-data)) or `scripts/video_to_frames.py` (e.g., [`minerl`](https://archive.org/details/minerl_navigate) and [`mazes`](https://archive.org/details/gqn_mazes)) to convert them into frames.
If the dataset does not come with a default train-test split, you can use `scripts/split.py` to set one up, e.g., for [`biking`](https://github.com/NVlabs/long-video-gan?tab=readme-ov-file#preparing-datasets) and [`riding`](https://github.com/NVlabs/long-video-gan?tab=readme-ov-file#preparing-datasets).
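If you prefer to roll your own conversion, a minimal frame dump could look like the following. This is a hypothetical OpenCV-based snippet, not the repo's `scripts/video_to_frames.py`, which remains the reference implementation; it only illustrates the `000001.jpg` naming convention the dataloader expects.

```python
# Hypothetical example: dump one video into numbered JPEG frames in a clip folder.
import os
import cv2

def video_to_frames(video_path, clip_dir):
    os.makedirs(clip_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        index += 1
        cv2.imwrite(os.path.join(clip_dir, f"{index:06d}.jpg"), frame)
    cap.release()

video_to_frames("clip.mp4", "data/MyDataset/train/clip_0001")
```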
## Usage
### Pre-trained weights
You can download the [pre-trained weights](https://drive.google.com/file/d/1SOylrO6udRW_Qd6YRRIXHnv3FmHc3ukL/view?usp=sharing) for the six datasets reported in our paper.
```bash
# If daily download limit was reached, please download manually
gdown 1SOylrO6udRW_Qd6YRRIXHnv3FmHc3ukL -O checkpoints_ode-video.zip
unzip checkpoints_ode-video.zip
rm checkpoints_ode-video.zip
```
### Training (from scratch)
We use Hugging Face [accelerate](https://github.com/huggingface/accelerate) to set up gradient accumulation and mixed-precision training (a minimal accumulation sketch follows the example command below).
The default arguments are already specified in the script.
If you want to change them, run `accelerate config`.
```bash
# USAGE:
#   train_videoflow.sh <train_data_dir> <output_dir> <image_size> <num_gpus> <accum_steps>
# ARGS:
#   <train_data_dir>: the folder of your training dataset
#   <output_dir>: the folder to save checkpoints and logs
#   <image_size>: resize the training images to this size
#   <num_gpus>: the number of GPUs
#   <accum_steps>: the number of batches whose gradients are accumulated before each optimizer step.
#     This will NOT affect the actual batch size,
#     but allows you to use a large batch size with limited GPU memory
#     by performing one optimizer step after several backward passes.
bash scripts/train_videoflow.sh data/sky_timelapse/sky_train experiments_weights/sky 128 1 4
```
The above command is an example that trains `condiff`, `flow`, and `bi-flow` on the `sky` dataset.
If you run out of GPU memory, use more accumulation steps.
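As a minimal illustration of what gradient accumulation with `accelerate` looks like, here is an assumed training loop; it is not the repo's actual one (the real entry point is `scripts/train_videoflow.sh`), and `model`, `optimizer`, and `dataloader` are placeholders supplied by the caller.

```python
from accelerate import Accelerator

def train_with_accumulation(model, optimizer, dataloader, accum_steps=4):
    # Hypothetical loop: accumulate gradients over `accum_steps` batches per optimizer step.
    accelerator = Accelerator(gradient_accumulation_steps=accum_steps, mixed_precision="fp16")
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
    for frame_t, frame_t1 in dataloader:
        with accelerator.accumulate(model):   # optimizer only steps every `accum_steps` batches
            loss = model(frame_t, frame_t1)   # placeholder: model returns a scalar loss
            accelerator.backward(loss)
            optimizer.step()
            optimizer.zero_grad()
```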
### Sampling
> Note that our trained ODEs generate the next frames, but the first frame has to be given or generated separately.
> Thus, you need to set up the test split of the corresponding dataset in order to sample (generate) videos with the trained weights.
To sample the trained models, you can use:
```bash
# USAGE:
#   sample_videoflow.sh <test_data_dir> <ckpt_dir> <output_dir> <image_size> <num_videos> <batch_size> <num_frames>
# ARGS:
#   <test_data_dir>: the folder of your test dataset
#   <ckpt_dir>: the folder of your training checkpoints and logs
#   <output_dir>: the folder to save sampling results
#   <image_size>: the image size you sample at
#   <num_videos>: the number of videos to generate; must be a multiple of the batch size
#   <batch_size>: the batch size used in sampling
#   <num_frames>: the number of frames in each sample
bash scripts/sample_videoflow.sh data/sky_timelapse/sky_test experiments_weights/sky experiments_inference/sky 128 64 8 32
```
The above command generates 64 videos of 32 frames each. The sampling script samples `condiff` and `flow`, together with `bi-flow` under four different levels of inference noise.
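For intuition only, autoregressive next-frame generation from a trained velocity field could be sketched roughly as below. It assumes a hypothetical `velocity_net(x, tau)` interface and plain Euler integration; it is not the repo's sampling code, which you should run via `scripts/sample_videoflow.sh`.

```python
import torch

@torch.no_grad()
def rollout(velocity_net, first_frame, num_frames=32, euler_steps=8):
    """Hypothetical sketch: integrate a learned ODE from frame t to frame t+1, repeatedly.

    `velocity_net(x, tau)` is an assumed interface predicting the velocity at
    interpolation time tau in [0, 1]; the real repo scripts may differ.
    """
    frames = [first_frame]
    x = first_frame
    dt = 1.0 / euler_steps
    for _ in range(num_frames - 1):
        for step in range(euler_steps):
            tau = torch.full((x.shape[0],), step * dt, device=x.device)
            x = x + dt * velocity_net(x, tau)  # forward Euler step along the flow
        frames.append(x)
    return torch.stack(frames, dim=1)  # (batch, num_frames, C, H, W)
```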
## Citation
If you find this useful or adopt (parts of) our project, please cite our paper:
```bibtex
@article{liu2025generative,
title={Generative Video Bi-flow},
author={Liu, Chen and Ritschel, Tobias},
journal={arXiv preprint arXiv:2503.06364},
year={2025}
}
```