https://github.com/sungnyun/diffblender

# DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models 🔥



- **DiffBlender** successfully synthesizes complex combinations of input modalities. It enables flexible manipulation of conditions, providing customized generation aligned with user preferences.
- Its structure is designed to extend intuitively to additional modalities, while keeping training costs low through partial updates of hypernetworks.
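The bullets above describe composing an arbitrary subset of conditioning modalities. As a minimal sketch of what "composable" conditioning can look like, the snippet below bundles a text prompt with any mix of extra conditions; the function and modality names are illustrative assumptions, not DiffBlender's actual API.

```python
# Hypothetical sketch of composable multimodal conditioning.
# Names (compose_conditions, sketch, depth, color_palette) are
# illustrative only and do not reflect DiffBlender's real interface.

def compose_conditions(text_prompt, **modalities):
    """Bundle a text prompt with any subset of extra conditions.

    Modalities passed as None are dropped, so callers can freely mix
    e.g. sketch, depth, or color-palette inputs.
    """
    conditions = {"text": text_prompt}
    for name, value in modalities.items():
        if value is not None:
            conditions[name] = value
    return conditions

# Any combination of modalities can be supplied or omitted:
conds = compose_conditions(
    "a cozy cabin in the woods",
    sketch="sketch.png",
    depth=None,                          # omitted: no depth map provided
    color_palette=["#224466", "#ddeeff"],
)
print(sorted(conds))  # ['color_palette', 'sketch', 'text']
```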


*(teaser figure)*

## 🗓️ TODOs

- [x] Project page is open: [link](https://sungnyun.github.io/diffblender/)
- [x] DiffBlender model: code & checkpoint
- [x] Release inference code
- [ ] Release training code & pipeline
- [ ] Gradio UI

## 🚀 Getting Started
Install the necessary packages with:
```sh
$ pip install -r requirements.txt
```

Download the DiffBlender model checkpoint from this [Hugging Face model](https://huggingface.co/sungnyun/diffblender) and place it under `./diffblender_checkpoints/`.
Also prepare the Stable Diffusion model from this [link](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original); we used `CompVis/sd-v1-4.ckpt`.
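The expected on-disk layout, as a hedged sketch; `{CKPT_NAME}.pth` is a placeholder for the actual checkpoint filename, which the commands below do not assume:

```shell
# Create the directory the inference script reads checkpoints from.
mkdir -p diffblender_checkpoints
# Place the downloaded files like so (filenames are placeholders):
#   diffblender_checkpoints/{CKPT_NAME}.pth   <- DiffBlender checkpoint
#   /path/to/sd-v1-4.ckpt                     <- Stable Diffusion v1.4 weights
ls diffblender_checkpoints
```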

## ⚡️ Try Multimodal T2I Generation with DiffBlender
```sh
$ python inference.py --ckpt_path=./diffblender_checkpoints/{CKPT_NAME}.pth \
--official_ckpt_path=/path/to/sd-v1-4.ckpt \
--save_name={SAVE_NAME}
```

Results will be saved under `./inference/{SAVE_NAME}/`, formatted as {conditions + generated image}.
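To work with the outputs programmatically, a small helper like the following can gather the generated files. This is a sketch under the assumption that each run writes image files directly into `./inference/{SAVE_NAME}/`; the helper name is ours, not part of the repo.

```python
# Hypothetical helper: collect image outputs from an inference run.
# Assumes results are written as image files under inference/{SAVE_NAME}/.
from pathlib import Path

def collect_results(save_name, root="inference"):
    """Return the sorted image paths produced by one inference run."""
    run_dir = Path(root) / save_name
    image_exts = {".png", ".jpg", ".jpeg"}
    return sorted(p for p in run_dir.glob("*")
                  if p.suffix.lower() in image_exts)

# Example: list every image generated under inference/my_run/
for path in collect_results("my_run"):
    print(path)
```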


## BibTeX
```
@article{kim2023diffblender,
title={DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models},
author={Kim, Sungnyun and Lee, Junsoo and Hong, Kibeom and Kim, Daesik and Ahn, Namhyuk},
journal={arXiv preprint arXiv:2305.15194},
year={2023}
}
```