https://github.com/eclipse-t2i/eclipse-inference

[CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation"

Host: GitHub
URL: https://github.com/eclipse-t2i/eclipse-inference
Owner: eclipse-t2i
License: mit
Created: 2023-12-07T05:17:08.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-05-01T17:36:16.000Z (about 1 year ago)
Last Synced: 2024-10-31T00:39:54.941Z (9 months ago)
Language: Python
Homepage: https://eclipse-t2i.vercel.app/
Size: 3.54 MB
Stars: 60
Watchers: 4
Forks: 9
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

## [CVPR 2024] ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation

---

This repository contains the inference code for our paper, ECLIPSE.

We show how to use the pre-trained ECLIPSE text-to-image prior together with diffusion image decoders such as Karlo and Kandinsky.

- ECLIPSE presents a tiny prior learning strategy that compresses previous prior models from 1 billion parameters down to 33 million parameters.

- Additionally, the ECLIPSE prior is trained on a mere 5 million image-text (alt-text) pairs.
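
To put the parameter counts above in perspective, the following short sketch works out the compression ratio implied by the two figures quoted in the list (1 billion and 33 million). The function name `compression_stats` is only illustrative and is not part of this repository.

```python
def compression_stats(original_params: int, compressed_params: int):
    """Return (size ratio, fractional reduction) between two parameter counts."""
    ratio = original_params / compressed_params
    reduction = 1.0 - compressed_params / original_params
    return ratio, reduction


# Figures quoted above: ~1 billion parameters compressed to ~33 million.
ratio, reduction = compression_stats(1_000_000_000, 33_000_000)
print(f"{ratio:.1f}x smaller, {reduction:.1%} fewer parameters")
# prints "30.3x smaller, 96.7% fewer parameters"
```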

> **_News:_**  Check out our latest work, [λ-ECLIPSE](https://eclipse-t2i.github.io/Lambda-ECLIPSE/), which extends the T2I priors to efficient zero-shot multi-subject-driven text-to-image generation.

**Follow the steps below to run inference locally.**

---

**Qualitative Comparisons:**

![Examples](./assets/example.png)

**Quantitative Comparisons:**

![Results](./assets/results.png)

## TODOs:

- [x] ~~Release ECLIPSE priors for Kandinsky v2.2 and Karlo-v1-alpha.~~

- [x] ~~Release the demo.~~

- [ ] Release ECLIPSE prior with Kandinsky v2.2 LCM decoder. (soon!)

- [ ] Release ECLIPSE prior training code. (will be released in a separate repository)

## Setup

### Installation

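```bash
# Clone the repository and enter it
git clone git@github.com:eclipse-t2i/eclipse-inference.git
cd eclipse-inference

# Create and activate a local conda environment, then install dependencies
conda create -p ./venv python=3.9
conda activate ./venv
pip install -r requirements.txt
```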

### Demo

```bash
conda activate ./venv
gradio main.py
```

## Run Inference

This repository supports two pre-trained image decoders: [Karlo-v1-alpha](https://huggingface.co/kakaobrain/karlo-v1-alpha) and [Kandinsky-v2.2](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder).

**Note:** The ECLIPSE prior is not a diffusion model, whereas the image decoders are.
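
The inference examples in this section move the pipelines to `"cuda"`. On a machine without a CUDA-capable GPU, a small helper along these lines can choose a fallback device. This is a minimal sketch: `pick_device` is a hypothetical name and not part of this repository, and CPU inference with these decoders is likely to be very slow.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu".

    Hypothetical helper, not part of this repository.
    """
    # Without PyTorch installed there is nothing to query, so fall back to CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

The returned string can then be passed wherever the examples call `.to("cuda")`, for instance `.to(device)`.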

### Kandinsky Inference

```python
import torch
from transformers import CLIPTextModelWithProjection, CLIPTokenizer
from diffusers import DiffusionPipeline

from src.pipelines.pipeline_kandinsky_prior import KandinskyPriorPipeline
from src.priors.prior_transformer import PriorTransformer

# CLIP text encoder and tokenizer used by the prior
text_encoder = CLIPTextModelWithProjection.from_pretrained(
    "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k",
    projection_dim=1280,
    torch_dtype=torch.float32,
)
tokenizer = CLIPTokenizer.from_pretrained(
    "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k"
)

# Load the ECLIPSE prior and plug it into the Kandinsky prior pipeline
prior = PriorTransformer.from_pretrained("ECLIPSE-Community/ECLIPSE_KandinskyV22_Prior")
pipe_prior = KandinskyPriorPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-2-prior",
    prior=prior,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
).to("cuda")

# Kandinsky v2.2 diffusion image decoder
pipe = DiffusionPipeline.from_pretrained("kandinsky-community/kandinsky-2-2-decoder").to("cuda")

prompt = "black apples in the basket"

# The prior maps the prompt to image embeddings; the decoder renders them
image_embeds, negative_image_embeds = pipe_prior(prompt).to_tuple()
images = pipe(
    num_inference_steps=50,
    image_embeds=image_embeds,
    negative_image_embeds=negative_image_embeds,
).images

images[0]
```
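
The final `images[0]` expression only displays the result in a notebook. When running the example as a script, the images can be written to disk instead. The sketch below assumes the decoder returns PIL images (the default `output_type` in diffusers pipelines), which expose a `.save()` method; `save_images`, `out_dir`, and `prefix` are hypothetical names introduced for illustration and are not part of this repository.

```python
from pathlib import Path


def save_images(images, out_dir="outputs", prefix="eclipse"):
    """Save each image as a numbered PNG and return the written paths.

    Hypothetical helper: assumes every item has a PIL-style `.save(path)` method.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    saved = []
    for index, image in enumerate(images):
        target = out_path / f"{prefix}_{index:03d}.png"
        image.save(target)  # PIL infers the PNG format from the file extension
        saved.append(target)
    return saved
```

For example, `save_images(images)` after the pipeline call would write `outputs/eclipse_000.png` for the first generated image.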

### Karlo Inference

```python
from src.pipelines.pipeline_unclip import UnCLIPPipeline
from src.priors.prior_transformer import PriorTransformer

# Load the ECLIPSE prior and plug it into the Karlo unCLIP pipeline
prior = PriorTransformer.from_pretrained("ECLIPSE-Community/ECLIPSE_Karlo_Prior")
pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", prior=prior).to("cuda")

prompt = "black apples in the basket"
images = pipe(prompt, decoder_guidance_scale=7.5).images

images[0]
```

## Acknowledgement

We would like to acknowledge the excellent open-source text-to-image models Karlo and Kandinsky, without which this work would not have been possible. We also thank HuggingFace for streamlining the T2I models.
