https://github.com/laion-ai/conditioned-prior

(wip) Use LAION-AI's CLIP "conditoned prior" to generate CLIP image embeds from CLIP text embeds.
https://github.com/laion-ai/conditioned-prior

Last synced: about 1 year ago
JSON representation

(wip) Use LAION-AI's CLIP "conditoned prior" to generate CLIP image embeds from CLIP text embeds.

Host: GitHub
URL: https://github.com/laion-ai/conditioned-prior
Owner: LAION-AI
License: mit
Created: 2022-07-05T10:02:17.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2022-07-14T15:55:12.000Z (almost 4 years ago)
Last Synced: 2025-05-07T18:13:52.744Z (about 1 year ago)
Language: Python
Homepage:
Size: 26.4 KB
Stars: 27
Watchers: 5
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Conditioned Prior (WIP)

Weights and code by [@nousr](https://twitter.com/nousr_)

Predict a CLIP image embedding from its text embedding using a diffusion prior.

This code is part of an effort to replicate the models laid out in [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125).

## Requirements

* nvidia GPU
* [docker](https://docs.docker.com/get-docker/)
* [cog](https://github.com/replicate/cog/)

## Quick start

```sh
cog predict r8.im/laion-ai/conditioned-prior \
-i prompt="..." \
-i candidates=2 \
-i cond_scale=1.0 \
-i overwrite=False
```

### Parameters

* `prompt` - Text to invert to a CLIP image embed (Required)

* `cond_scale` - How much prior guidance to use. (Default 1.0)

* `candidates` - Number of image embeds to draw from in the prior. Increasing may improve performance. (Default: 2)

* `overwrite` - Recomputes all embeds from scratch, even if they already exist on local storage. (Default: False)

### Output

A `PriorOutput` - typed dictionary containing the text tokens, text embed and the image embed.

* `text_embedding: List[float]` - CLIP embed of your text, included as convenience for cosine similarity.

* `image_embedding: List[float]` - CLIP image embedding inverted from the text embedding using the conditoned-prior model.

Outputs are also stored as numpy arrays in the current directory at `f"./.embed_cache/text_embedding_{prompt}.npy"`
and `f"./.embed_cache/image_embedding_{prompt}.npy"`

## Intended use

Anytime you need a CLIP image embed but only have a text description. For instance:

* Use as input to models that accept CLIP embeds such as CLIP-guided VQGAN, diffusion to improve generations.

* Use to improve performance on lookup tasks

## Special Thanks

* [LAION](https://discord.gg/uPMftTmrvS) for support, resources, and community

* [Stability AI](https://stability.ai/) for compute which makes these models possible

* [lucidrains](https://github.com/lucidrains) for spearheading the open-source replication of DALLE 2

## Caveats and recommendations

Just to avoid any confusion, this research is a recreation of (one part of) OpenAI's DALLE2 paper. It is _not_, "DALLE2", the product/service from OpenAI you may have seen on the web.

## Contribute

* Install [docker](https://docs.docker.com/get-docker/).
* Install [cog](https://github.com/replicate/cog/).

```sh
git clone https://github.com/laion-ai/conditoned-prior.git && cd conditioned-prior
```

### Build the docker image from scratch

Download the "slim" weights:

```sh
wget https://huggingface.co/laion/DALLE2-PyTorch/resolve/main/prior_ema_fp16.pth
```

Then, run:

```sh
cog build -t "my-custom-conditioned-prior"
```

### Local prediction flask endpoint

```sh
docker run -d -p 5000:5000 --gpus=all 'my-custom-conditioned-prior'
```

A `POST` route `/predictions` will now trigger the model to be run. Weights are only loaded into GPU memory once upon running `docker run`, making repeated API calls faster.

```sh
curl http://localhost:5000/predictions -X POST -H "Content-Type: application/json" \
-d '{"input": {
"prompt": "...",
"candidates": "2",
"cond_scale": "1.0"
}}'
```

### Push a fork to your own Replicate account

First, edit the `image` property in [cog.yaml](/cog.yaml)

```yaml
# ...
image: "" # TODO put your own url here after creating a model on Replicate.
build:
gpu: true
python_version: "3.8"
# ...
```

Make sure you are logged in:

```sh
cog login
```

and push your docker image to Replicate:

```sh
cog push
```

### Update the official laion-ai Replicate demo/api
If you need to change the Replicate demo uploaded to `replicate.com/laion-ai/conditioned-prior`, you will need to be invited to be part of the laion-ai org on Replicate. Reach out to @afiaka87, @robvanvolt, @christophschuhmann, or @rom1504 if you need to.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/laion-ai/conditioned-prior

Awesome Lists containing this project

README