https://github.com/Zeqiang-Lai/Mini-DALLE3
Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models
- Host: GitHub
- URL: https://github.com/Zeqiang-Lai/Mini-DALLE3
- Owner: Zeqiang-Lai
- Created: 2023-09-21T14:41:39.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-28T13:53:32.000Z (over 1 year ago)
- Last Synced: 2025-03-24T09:05:41.285Z (3 months ago)
- Topics: dall-e-3, dalle, dalle-3, dalle3, interactive-text-to-image, mini-dalle3
- Language: Python
- Homepage: https://minidalle3.github.io/
- Size: 168 KB
- Stars: 307
- Watchers: 4
- Forks: 29
- Open Issues: 7
Metadata Files:
- Readme: README.md
Technical Report • Project page • Demo (Temporarily Unavailable)

https://github.com/Zeqiang-Lai/Mini-DALLE3/assets/26198430/5b6c0a0c-ebbf-48db-981e-f97d542a38b4

> An experimental attempt to reproduce the interactive, interleaved text-to-image and text-to-text experience of [DALL•E 3](https://openai.com/dall-e-3) and [ChatGPT](https://openai.com/chatgpt).

## Try Yourself 🤗
- Download the [checkpoint](https://huggingface.co/h94/IP-Adapter) and save it with the following layout:
```bash
checkpoints
- models
- sdxl_models
```

- Run the following commands to start a Gradio-based web demo.
```bash
export OPENAI_API_KEY="your key"
python -m minidalle3.web
```

- To use an LLM other than ChatGPT, such as `baichuan`:
```bash
python -m minidalle3.llm.baichuan
export OPENAI_API_BASE="http://0.0.0.0:10039/v1"
python -m minidalle3.web
```

> `chatglm`, `baichuan`, and `internlm` are tested.
> `llama` is not supported yet. `qwen` is not tested.

## TODO
- [x] Support generating image interleaved in the conversations.
- [ ] Support generating multiple images at once.
- [ ] Support selecting image.
- [ ] Support refinement.
- [ ] Support prompt refinement/variation.
- [ ] Instruction-tuned LLM/SD.

## Citation
If you find this repo helpful, please consider citing us.
```bibtex
@misc{minidalle3,
author={Lai, Zeqiang and Zhu, Xizhou and Dai, Jifeng and Qiao, Yu and Wang, Wenhai},
title={Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models},
year={2023},
url={https://github.com/Zeqiang-Lai/Mini-DALLE3},
}
```

## Acknowledgement
[IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) • [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
