# Image Neural Field Diffusion Models
![infd](https://github.com/user-attachments/assets/e8296750-6ec0-4917-8eb5-7dedd6c85dbb)
Official implementation of the paper:
[**Image Neural Field Diffusion Models**](https://arxiv.org/abs/2406.07480)
Yinbo Chen, Oliver Wang, Richard Zhang, Eli Shechtman, Xiaolong Wang, Michael Gharbi
CVPR 2024 (Highlight)

Contact [email protected] for any issues about the code.
## Environment
```
conda create -n infd python=3.8 -y
conda activate infd
pip install -r requirements.txt
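# Optional sanity check (illustrative, not part of the repo): confirm that
# PyTorch imports and the GPUs are visible before launching training.
python -c "import torch; print(torch.__version__, torch.cuda.device_count())"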
```

## Training
Below is an example of training on FFHQ-1024 with 8 GPUs.
Download the FFHQ dataset ([images1024x1024.zip](https://drive.google.com/drive/folders/1WocxvZ4GEZ1DI8dOz30aSj2zT6pkATYS)), unzip it, and place the image folder at `load/ffhq/ffhq_1024`.
To visualize with wandb, fill in the information in `wandb.yaml` and append `-w` to the run commands.
To train for the FFHQ-6K-Mix setting, append `-mix6000` to the yaml config names.
### 1. Autoencoding stage
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --standalone --nproc-per-node=8 run.py --cfg cfgs/ae_ffhq.yaml
```

### 2. Latent diffusion stage
First resize the images for faster loading:
```
python resize_images.py --input load/ffhq/ffhq_1024 --output load/ffhq/ffhq_lanczos256
```
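The resizing only speeds up data loading. As a rough sketch of what this step amounts to, assuming a plain PIL Lanczos downsample to 256x256 (as the output name `ffhq_lanczos256` suggests; the actual `resize_images.py` may differ in details):

```python
# Illustrative stand-in for resize_images.py, assuming a plain Lanczos
# downsample of every PNG in the input folder to 256x256.
from pathlib import Path
from PIL import Image

src = Path("load/ffhq/ffhq_1024")
dst = Path("load/ffhq/ffhq_lanczos256")
dst.mkdir(parents=True, exist_ok=True)
for p in sorted(src.glob("*.png")):
    Image.open(p).resize((256, 256), Image.LANCZOS).save(dst / p.name)
```

Then run: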
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --standalone --nproc-per-node=8 run.py --cfg cfgs/dm_ffhq.yaml
```

### Custom Datasets
To train on custom datasets, use `ae_custom.yaml` and `dm_custom.yaml` as the configs, and replace `root_path` in them with the path to your image folder.
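Before launching, a quick sanity check (illustrative, not part of the repo) that the folder is readable as a flat image directory; the path below is a placeholder for your `root_path` value:

```python
# Hypothetical sanity check; "load/custom/my_images" is a placeholder path.
from pathlib import Path
from PIL import Image

root = Path("load/custom/my_images")
files = [p for p in root.iterdir() if p.suffix.lower() in {".png", ".jpg", ".jpeg"}]
print(f"{len(files)} images found under {root}")
Image.open(files[0]).verify()  # raises if the first file is not a valid image
```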
## Evaluation
### 1. Generate samples
Sampling can use a single GPU or multiple GPUs. For example, with 2 GPUs:
```
CUDA_VISIBLE_DEVICES=0,1 python gen_samples.py --model save/dm_ffhq/last-model.pth --n-samples 50000 --batch-size 32 -o save/gen_samples --output-sizes 1024
```

By default, it uses the sampler defined in the model (200 DDIM steps, eta=1, following LDM).
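For reference, a minimal sketch of a single DDIM update (Song et al.), which the 200-step, eta=1 schedule applies repeatedly; this is illustrative, not the repo's sampler code:

```python
# One DDIM step: x_t -> x_{t-1}. a_t and a_prev are the cumulative alphas
# (alpha-bar) at the current and previous timesteps; eta=1 injects the full
# DDPM-style noise, while eta=0 makes the update deterministic.
import torch

def ddim_step(x_t, eps, a_t, a_prev, eta=1.0):
    x0_pred = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()        # estimate x_0
    sigma = eta * ((1 - a_prev) / (1 - a_t)).sqrt() * (1 - a_t / a_prev).sqrt()
    dir_xt = (1 - a_prev - sigma**2).sqrt() * eps                # deterministic part
    return a_prev.sqrt() * x0_pred + dir_xt + sigma * torch.randn_like(x_t)
```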
### 2. Evaluate patch FID
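Patch FID compares feature statistics of patches cropped from the two image sets at native resolution, rather than of whole images downsampled to the FID input size, so it stays sensitive to high-frequency detail. A conceptual sketch of the cropping side; the patch size and per-image count here are assumptions, and `eval_pfid.py` defines the actual protocol:

```python
# Conceptual patch extraction for patch FID; standard FID is then computed
# over the two resulting patch sets (real vs. generated).
import random
from pathlib import Path
from PIL import Image

def random_patches(folder, patch=299, per_image=1):
    for p in sorted(Path(folder).glob("*.png")):
        img = Image.open(p).convert("RGB")
        for _ in range(per_image):
            x = random.randrange(img.width - patch + 1)
            y = random.randrange(img.height - patch + 1)
            yield img.crop((x, y, x + patch, y + patch))
```

To run the repo's evaluation: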
```
CUDA_VISIBLE_DEVICES=0 python eval_pfid.py --input1 load/ffhq/ffhq_1024 --input2 save/gen_samples/1024
```

## Citation
```
@inproceedings{chen2024image,
  title={Image Neural Field Diffusion Models},
  author={Chen, Yinbo and Wang, Oliver and Zhang, Richard and Shechtman, Eli and Wang, Xiaolong and Gharbi, Michael},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8007--8017},
  year={2024}
}
```