# SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models



Pingchuan Ma* · Xiaopei Yang* · Yusong Li


Ming Gui · Felix Krause · Johannes Schusterbauer · Björn Ommer



CompVis Group @ LMU Munich     Munich Center for Machine Learning (MCML)





* equal contribution



📄 ICCV 2025


Website · Paper

This repository contains the official implementation of the paper "SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models".
We propose a flow-matching framework that learns an invertible mapping between style–content mixtures and their separate representations, avoiding explicit disentanglement objectives. Alongside the method, we curated a synthetic dataset of 510k images, comprising 10k content instances rendered in 51 distinct styles.


Cover
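The training signal behind such a flow model can be sketched with the generic (conditional) flow-matching objective: sample a time `t`, linearly interpolate between a source sample `x0` and a target sample `x1`, and regress a velocity network onto the constant target `x1 - x0`. This is a minimal illustration of the general technique, not the exact loss from the paper; all names and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_targets(x0, x1, t):
    """Straight-line interpolation path x_t and its velocity target.

    x0, x1: (batch, dim) source / target samples
    t:      (batch, 1) times in [0, 1]
    """
    x_t = (1.0 - t) * x0 + t * x1        # point on the straight path
    v_target = x1 - x0                   # constant velocity along that path
    return x_t, v_target

# Toy batch: for straight paths the regression target is independent of t.
x0 = rng.normal(size=(4, 768))           # e.g. a mixed style-content embedding
x1 = rng.normal(size=(4, 768))           # e.g. a separated representation
t = rng.uniform(size=(4, 1))
x_t, v = flow_matching_targets(x0, x1, t)
```

A network trained to predict `v` from `(x_t, t)` can then be integrated as an ODE in either direction, which is what makes the mapping invertible.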

## 🛠️ Setup
Create the environment with conda:
```bash
conda create -n scflow python=3.10
conda activate scflow
pip install -r requirements.txt
```
The environment was tested on `Ubuntu 22.04.5 LTS` with `CUDA 12.1`. You can *optionally* install jupyter-notebook to run the notebook provided in [`notebooks`](https://github.com/CompVis/SCFlow/tree/main/notebooks).

Download the model checkpoints:
```bash
mkdir ckpts
cd ckpts

# model checkpoint
wget https://huggingface.co/CompVis/SCFlow/resolve/main/scflow_last.ckpt

# unclip checkpoint for visualization
wget https://huggingface.co/CompVis/SCFlow/resolve/main/sd21-unclip-l.ckpt
```

Download the training and test splits of the dataset:
```bash
# return to parent dir
cd ..
mkdir dataset
cd dataset

# training split with metadata (content/style indices, content descriptions, etc.)
wget https://huggingface.co/CompVis/SCFlow/resolve/main/train.h5

# test split with metadata (content/style indices, content descriptions, etc.)
wget https://huggingface.co/CompVis/SCFlow/resolve/main/test.h5

```

## 🔥 Usage
The following bash scripts are thin wrappers for an easy start. You can adjust the arguments by calling `training.py` and `inference.py` directly.

Inference forward (merge content and style)
```bash
bash scripts/inference_forward.sh
```
Inference reverse (disentangle content and style from a given reference)
```bash
bash scripts/inference_reverse.sh
```

Training requires ~22 GB of GPU memory with the default settings.
```bash
bash scripts/training.sh
```
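The forward/reverse split above reflects that a flow defines an (approximately) invertible ODE: integrating the learned velocity field from t=0 to t=1 merges content and style, and integrating back from t=1 to t=0 separates them. A minimal numerical sketch of this round trip, using a hand-written toy velocity field in place of the trained network:

```python
import numpy as np

def integrate(x, v, t0, t1, steps=1000):
    """Euler integration of dx/dt = v(x, t) from t0 to t1."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        x = x + dt * v(x, t)
        t += dt
    return x

v = lambda x, t: -x                      # toy field standing in for the model
x0 = np.array([1.0, -2.0, 0.5])
x1 = integrate(x0, v, 0.0, 1.0)          # "forward" pass (merge)
x0_rec = integrate(x1, v, 1.0, 0.0)      # "reverse" pass (disentangle)
```

With enough integration steps, `x0_rec` recovers `x0` up to discretization error; the provided scripts play the same two roles with the learned field.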

## 🗂️ Dataset Overview
We host the dataset on HF (currently only the CLIP embeddings and their corresponding metadata, due to space limits). You can download it as instructed in the section above. The file `train.h5` (the same holds for `test.h5`) is an HDF5 dataset storing embeddings and metadata used for training. You can load it in Python with:

```python
import h5py
train = h5py.File("./dataset/train.h5", "r")
```

The main groups inside are:

- **images**: Contains CLIP embeddings with shape `(357000, 768)`, representing feature vectors for training samples.
- **metadata**: Contains descriptive information with keys:
  - `content_description`
  - `content_idx`
  - `style_idx`
  - `style_name`

> **Note:** Some metadata entries are duplicated because there are 7,000 content instances for training and 3,000 for testing: the same content rendered in different styles shares identical `content_description` and `content_idx`.
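The access pattern for these groups can be sketched as below. To keep the sketch self-contained it first writes a tiny dummy file mirroring the layout described above; the group/key names come from the list above, but the exact internal layout (a `metadata` group holding one dataset per key) is an assumption.

```python
import h5py
import numpy as np

# Build a tiny dummy file mirroring the described layout, so the access
# pattern below runs without the real 357k-sample download.
with h5py.File("toy_train.h5", "w") as f:
    f.create_dataset("images", data=np.zeros((6, 768), dtype=np.float32))
    meta = f.create_group("metadata")
    meta.create_dataset("content_description", data=[b"a red apple"] * 6)
    meta.create_dataset("content_idx", data=np.repeat(np.arange(2), 3))
    meta.create_dataset("style_idx", data=np.tile(np.arange(3), 2))
    meta.create_dataset("style_name", data=[b"Cubism", b"Sketch", b"Oil"] * 2)

with h5py.File("toy_train.h5", "r") as train:
    emb = train["images"][:]                    # CLIP embeddings, (N, 768)
    styles = train["metadata"]["style_idx"][:]  # one style id per sample
```

With the real `train.h5`, `train["images"]` should have shape `(357000, 768)` as stated above (7,000 contents × 51 styles).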

### Original Images in 512px
We host the original images on HF as well. You can download them with:
```bash

# The zip file is around 36.5 GB.
wget https://huggingface.co/CompVis/SCFlow/resolve/main/raw_512px.zip

```
The archive is structured by style, then content id, e.g., `Cubism/00001.jpg ... 10000.jpg`, where the content ids are consistent across different styles.
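Given that layout, locating all stylized renditions of one content instance is just path arithmetic. A small helper sketch, assuming 5-digit zero-padded filenames as in the example above (the style names passed in are illustrative):

```python
from pathlib import Path

def image_path(root, style, content_id):
    """Path to one image; content ids are shared across style folders."""
    return Path(root) / style / f"{content_id:05d}.jpg"

def style_variants(root, styles, content_id):
    """All stylized renditions of a single content instance."""
    return [image_path(root, s, content_id) for s in styles]

paths = style_variants("raw_512px", ["Cubism", "Pointillism"], 1)
```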

## 🎓 Citation & Contact

If you use this codebase or dataset, or find our work valuable, please cite our paper:
```bibtex
@inproceedings{ma2025scflow,
author = {Ma, Pingchuan and Yang, Xiaopei and Li, Yusong and Gui, Ming and Krause, Felix and Schusterbauer, Johannes and Ommer, Bj\"orn},
title = {SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2025},
pages = {14919-14929}
}
```

If you encounter any issues or would like to collaborate, please feel free to drop me a message:
* Email: p.ma(at)lmu(dot)de
* [linkedin](https://www.linkedin.com/in/pingchuan-ma-492543156/)

## 🔥 Updates and Backlogs
- [x] **[06.08.2025]** [ArXiv](https://arxiv.org/abs/2508.03402) paper available.
- [x] **[12.08.2025]** Released inference code and checkpoint.
- [x] **[31.10.2025]** Hosted the dataset (latents and metadata) and released the training code.
- [x] **[31.10.2025]** Uploaded the 512px JPG images to HF.