# Generative Video Bi-flow
> by [Chen Liu](https://ryushinn.github.io/) and [Tobias Ritschel](https://www.homepages.ucl.ac.uk/~ucactri/)
>
> _International Conference on Computer Vision (ICCV 2025)_
>
> Please also check out our ([Paper](https://arxiv.org/abs/2503.06364) | [Project Page](https://ryushinn.github.io/ode-video))
This repo provides the official implementation of our paper in PyTorch.
We also provide a compact [pseudocode](https://ryushinn.github.io/ode-video#method) that shows the core logic of our `bi-flow` algorithm without the surrounding engineering code.
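For orientation only, the sketch below shows a generic flow-matching-style next-frame training step. It is an illustrative assumption, not the paper's exact `bi-flow` objective; `velocity_net` and the straight-line interpolation path are placeholders, and the official pseudocode and code are the reference.

```python
import torch

def next_frame_flow_matching_step(velocity_net, optimizer, frame_t, frame_t1):
    """Illustrative flow-matching step between consecutive frames.

    NOTE: hypothetical sketch, not the paper's bi-flow loss.
    `velocity_net(x, tau)` is assumed to predict a velocity field that
    transports frame_t toward frame_t1 along a straight-line path.
    """
    b = frame_t.shape[0]
    tau = torch.rand(b, device=frame_t.device).view(b, 1, 1, 1)   # interpolation time in [0, 1]
    x_tau = (1.0 - tau) * frame_t + tau * frame_t1                # point on the straight path
    target_velocity = frame_t1 - frame_t                          # constant velocity of that path
    pred_velocity = velocity_net(x_tau, tau.flatten())
    loss = torch.mean((pred_velocity - target_velocity) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```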
## Setup
### Install
```bash
# 1. Clone the repo
git clone https://github.com/ryushinn/ode-video.git
cd ode-video
# 2. We recommend installing in a new virtual env with Python 3.10, e.g., via conda:
conda create -n ode-video python=3.10
conda activate ode-video
# 3. Install the dependencies
pip install -r requirements.txt
```
Our test environment is Ubuntu 22.04.4 x64 with an NVIDIA RTX 4090 GPU and CUDA 12.
### Data
Our dataloader expects the following folder structure:
```bash
data
└── {Dataset}
    ├── {train_split}
    │   └── ...  # Nested folders are allowed
    │       └── {clip_folder}
    │           ├── 000001.jpg  # first frame
    │           ├── 000002.jpg  # second frame
    │           └── ...
    └── {test_split}
        └── ...
            └── {clip_folder}
                ├── 000001.jpg
                ├── 000002.jpg
                └── ...
```
Every (sub)folder in the train or test split should contain only consecutive frames from the same video clip, named in sorted order.
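As a quick way to verify this layout, a hypothetical checker (not part of the repo) could walk a split folder and report the frame count per clip:

```python
# Hypothetical sanity check: list clip folders and frame counts under data/{Dataset}/{split}.
import os
import sys

def check_split(split_dir):
    for root, _, files in os.walk(split_dir):
        frames = sorted(f for f in files if f.endswith((".jpg", ".png")))
        if frames:  # this folder holds the frames of one clip
            print(f"{root}: {len(frames)} frames ({frames[0]} .. {frames[-1]})")

if __name__ == "__main__":
    check_split(sys.argv[1])  # e.g. python check_split.py data/sky_timelapse/sky_train
```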
For example, you can set up the `sky` dataset in this format using:
```bash
# If daily download limit was reached, please download manually at
# https://drive.google.com/uc?id=1xWLiU-MBGN7MrsFHQm4_yXmfHBsMbJQo
gdown 1xWLiU-MBGN7MrsFHQm4_yXmfHBsMbJQo -O sky_timelapse.zip
unzip sky_timelapse.zip -d data
rm sky_timelapse.zip
```
For datasets distributed in a format other than image frames, you can use `scripts/pt_to_frames.py` (e.g., [`CARLA`](https://github.com/plai-group/flexible-video-diffusion-modeling?tab=readme-ov-file#preparing-data)) or `scripts/video_to_frames.py` (e.g., [`minerl`](https://archive.org/details/minerl_navigate) and [`mazes`](https://archive.org/details/gqn_mazes)) to convert them into frames.
If the dataset does not come with a default train-test split, you can use `scripts/split.py` to set one up, e.g., for [`biking`](https://github.com/NVlabs/long-video-gan?tab=readme-ov-file#preparing-datasets) and [`riding`](https://github.com/NVlabs/long-video-gan?tab=readme-ov-file#preparing-datasets).
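If you prefer to roll your own conversion, a minimal frame dump could look like the following. This is a hypothetical OpenCV-based snippet, not the repo's `scripts/video_to_frames.py`, which remains the reference implementation; it only illustrates the `000001.jpg` naming convention the dataloader expects.

```python
# Hypothetical example: dump one video into numbered JPEG frames in a clip folder.
import os
import cv2

def video_to_frames(video_path, clip_dir):
    os.makedirs(clip_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        index += 1
        cv2.imwrite(os.path.join(clip_dir, f"{index:06d}.jpg"), frame)
    cap.release()

video_to_frames("clip.mp4", "data/MyDataset/train/clip_0001")
```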
## Usage
### Pre-trained weights
You can download the [pre-trained weights](https://drive.google.com/file/d/1SOylrO6udRW_Qd6YRRIXHnv3FmHc3ukL/view?usp=sharing) for the six datasets reported in our paper.
```bash
# If daily download limit was reached, please download manually
gdown 1SOylrO6udRW_Qd6YRRIXHnv3FmHc3ukL -O checkpoints_ode-video.zip
unzip checkpoints_ode-video.zip
rm checkpoints_ode-video.zip
```
### Training (from scratch)
We use Hugging Face [accelerate](https://github.com/huggingface/accelerate) to set up gradient accumulation and mixed-precision training (a minimal accumulation sketch follows the example command below).
The default arguments are already specified in the script.
If you want to change them, run `accelerate config`.
```bash
# USAGE:
#   train_videoflow.sh <train_data_dir> <output_dir> <image_size> <num_gpus> <accum_steps>
# ARGS:
#   <train_data_dir>: the folder of your training dataset
#   <output_dir>: the folder to save checkpoints and logs
#   <image_size>: resize the training images to this size
#   <num_gpus>: the number of GPUs
#   <accum_steps>: the number of batches whose gradients are accumulated before each optimizer step.
#     This will NOT affect the actual batch size,
#     but allows you to use a large batch size with limited GPU memory
#     by performing one optimizer step after several backward passes.
bash scripts/train_videoflow.sh data/sky_timelapse/sky_train experiments_weights/sky 128 1 4
```
The above command is an example that trains `condiff`, `flow`, and `bi-flow` on the `sky` dataset.
If you run out of GPU memory, use more accumulation steps.
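As a minimal illustration of what gradient accumulation with `accelerate` looks like, here is an assumed training loop; it is not the repo's actual one (the real entry point is `scripts/train_videoflow.sh`), and `model`, `optimizer`, and `dataloader` are placeholders supplied by the caller.

```python
from accelerate import Accelerator

def train_with_accumulation(model, optimizer, dataloader, accum_steps=4):
    # Hypothetical loop: accumulate gradients over `accum_steps` batches per optimizer step.
    accelerator = Accelerator(gradient_accumulation_steps=accum_steps, mixed_precision="fp16")
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
    for frame_t, frame_t1 in dataloader:
        with accelerator.accumulate(model):   # optimizer only steps every `accum_steps` batches
            loss = model(frame_t, frame_t1)   # placeholder: model returns a scalar loss
            accelerator.backward(loss)
            optimizer.step()
            optimizer.zero_grad()
```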
### Sampling
> Note that our trained ODEs generate the next frames, but the first frame has to be given or generated separately.
> Thus, you need to set up the test split of the corresponding dataset in order to sample (generate) videos with the trained weights.
To sample the trained models, you can use:
```bash
# USAGE:
#   sample_videoflow.sh <test_data_dir> <ckpt_dir> <output_dir> <image_size> <num_videos> <batch_size> <num_frames>
# ARGS:
#   <test_data_dir>: the folder of your test dataset
#   <ckpt_dir>: the folder of your training checkpoints and logs
#   <output_dir>: the folder to save sampling results
#   <image_size>: the image size you sample at
#   <num_videos>: the number of videos to generate; must be a multiple of the batch size
#   <batch_size>: the batch size used in sampling
#   <num_frames>: the number of frames in each sample
bash scripts/sample_videoflow.sh data/sky_timelapse/sky_test experiments_weights/sky experiments_inference/sky 128 64 8 32
```
The above command generates 64 videos of 32 frames each. The sampling script samples `condiff` and `flow`, together with `bi-flow` under four different levels of inference noise.
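For intuition only, autoregressive next-frame generation from a trained velocity field could be sketched roughly as below. It assumes a hypothetical `velocity_net(x, tau)` interface and plain Euler integration; it is not the repo's sampling code, which you should run via `scripts/sample_videoflow.sh`.

```python
import torch

@torch.no_grad()
def rollout(velocity_net, first_frame, num_frames=32, euler_steps=8):
    """Hypothetical sketch: integrate a learned ODE from frame t to frame t+1, repeatedly.

    `velocity_net(x, tau)` is an assumed interface predicting the velocity at
    interpolation time tau in [0, 1]; the real repo scripts may differ.
    """
    frames = [first_frame]
    x = first_frame
    dt = 1.0 / euler_steps
    for _ in range(num_frames - 1):
        for step in range(euler_steps):
            tau = torch.full((x.shape[0],), step * dt, device=x.device)
            x = x + dt * velocity_net(x, tau)  # forward Euler step along the flow
        frames.append(x)
    return torch.stack(frames, dim=1)  # (batch, num_frames, C, H, W)
```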
## Citation
If you find this useful or adopt (parts of) our project, please cite our paper:
```bibtex
@article{liu2025generative,
title={Generative Video Bi-flow},
author={Liu, Chen and Ritschel, Tobias},
journal={arXiv preprint arXiv:2503.06364},
year={2025}
}
```