https://github.com/tyshiwo1/Accelerating-T2I-AR-with-SJD

[ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

# SJD: Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

[Yao Teng](https://tyshiwo1.github.io/)<sup>1</sup>, [Han Shi](https://han-shi.github.io/)<sup>2</sup>, [Xian Liu](https://alvinliu0.github.io/)<sup>3</sup>, [Xuefei Ning](https://nics-effalg.com/ningxuefei/)<sup>4</sup>, [Guohao Dai](https://dai.sjtu.edu.cn/)<sup>5,6</sup>, [Yu Wang](https://scholar.google.com.hk/citations?user=j8JGVvoAAAAJ)<sup>4</sup>, [Zhenguo Li](https://zhenguol.github.io/)<sup>2</sup>, and [Xihui Liu](https://xh-liu.github.io/)<sup>1</sup>.

*<sup>1</sup>The University of Hong Kong, <sup>2</sup>Huawei Noah’s Ark Lab, <sup>3</sup>CUHK, <sup>4</sup>Tsinghua University, <sup>5</sup>Shanghai Jiao Tong University, <sup>6</sup>Infinigence AI*

## 🚩 New Features/Updates

- ✅ Apr, 2025. 💥 **SJD** has been integrated into [Lumina-mGPT2](https://github.com/Alpha-VLLM/Lumina-mGPT-2.0) and [SimpleAR](https://github.com/wdrink/SimpleAR).
- ✅ Jan, 2025. 💥 **SJD** is accepted to ICLR 2025.
- ✅ Oct, 2024. Release **SJD**'s code.

## 🚩 TODO List

- □ Integrate SJD into the vLLM framework for further acceleration.

## Installing the dependencies

##### Environment:

- Python 3.10
- CUDA 12.5
- PyTorch 2.5.1+cu124
- Transformers 4.47.1

##### Install from `yaml`:

```bash
conda env create -f environment.yaml
```
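After creating the environment, it can help to confirm that the installed versions match the list above. The following is a minimal sketch, not part of the repository: the helper names `base_version` and `check_environment` are hypothetical, and it assumes the packages are installed under the distribution names `torch` and `transformers`.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the "Environment" list above.
# The distribution names "torch" and "transformers" are assumptions.
EXPECTED = {"torch": "2.5.1", "transformers": "4.47.1"}
EXPECTED_PYTHON = (3, 10)


def base_version(v):
    """Drop a local build tag, so '2.5.1+cu124' compares as '2.5.1'."""
    return v.split("+", 1)[0]


def check_environment(expected=EXPECTED, expected_python=EXPECTED_PYTHON):
    """Return a list of mismatch messages; an empty list means all match."""
    problems = []
    current_python = tuple(sys.version_info[:2])
    if current_python != tuple(expected_python):
        problems.append(
            f"python: found {current_python[0]}.{current_python[1]}, "
            f"expected {expected_python[0]}.{expected_python[1]}"
        )
    for package, wanted in expected.items():
        try:
            found = version(package)
        except PackageNotFoundError:
            problems.append(f"{package}: not installed (expected {wanted})")
            continue
        if base_version(found) != wanted:
            problems.append(f"{package}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("MISMATCH", issue)
    if not issues:
        print("Environment matches the versions listed above.")
```

A mismatch does not necessarily mean the code will fail; the list above only records the versions the authors report using.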

## Performance

- Results on [Lumina-mGPT](https://github.com/Alpha-VLLM/Lumina-mGPT)

[Figure: results on Lumina-mGPT]

- Results on [Emu3](https://github.com/baaivision/Emu3)

[Figure: results on Emu3]

## Text-to-Image with SJD

#### Lumina-mGPT

```bash
CUDA_VISIBLE_DEVICES=0 python test_lumina_mgpt.py
```

#### Emu3

```bash
CUDA_VISIBLE_DEVICES=0 python test_emu3.py
```

#### LlamaGen

```bash
CUDA_VISIBLE_DEVICES=0 python test_llamagen.py
```

## Acknowledgements

Our code is based on [Lumina-mGPT](https://github.com/Alpha-VLLM/Lumina-mGPT), [Emu3](https://github.com/baaivision/Emu3), [LlamaGen](https://github.com/FoundationVision/LlamaGen), [Anole](https://github.com/GAIR-NLP/anole), and [CLLM](https://github.com/hao-ai-lab/Consistency_LLM). We would like to express our gratitude to [Tianwei Xiong](https://github.com/SilentView) for his assistance.

## Citation

```bibtex
@article{teng2024accelerating,
title={Accelerating auto-regressive text-to-image generation with training-free speculative jacobi decoding},
author={Teng, Yao and Shi, Han and Liu, Xian and Ning, Xuefei and Dai, Guohao and Wang, Yu and Li, Zhenguo and Liu, Xihui},
journal={arXiv preprint arXiv:2410.01699},
year={2024}
}
```