https://github.com/cj-mills/cjm-diffusers-utils

Some utility functions I frequently use with 🤗 diffusers.

Host: GitHub
URL: https://github.com/cj-mills/cjm-diffusers-utils
Owner: cj-mills
License: mit
Created: 2023-01-24T00:47:47.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-02-10T02:57:25.000Z (over 2 years ago)
Last Synced: 2025-02-13T15:49:32.138Z (8 months ago)
Language: Jupyter Notebook
Homepage: https://cj-mills.github.io/cjm-diffusers-utils/
Size: 9.21 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

cjm-diffusers-utils
================

## Install

``` sh
pip install cjm_diffusers_utils
```

## How to use

``` python
import torch
from cjm_pytorch_utils.core import get_torch_device

device = get_torch_device()
dtype = torch.float16 if device == 'cuda' else torch.float32
device, dtype
```

    ('cuda', torch.float16)

### pil_to_latent

``` python
from cjm_diffusers_utils.core import pil_to_latent
from PIL import Image
from diffusers import AutoencoderKL
```

``` python
model_name = "stabilityai/stable-diffusion-2-1"
vae = AutoencoderKL.from_pretrained(model_name, subfolder="vae").to(device=device, dtype=dtype)
```

``` python
img_path = '../images/cat.jpg'

# Load the source image and make sure it has three color channels
src_img = Image.open(img_path).convert('RGB')
print(f"Source Image Size: {src_img.size}")

# Encode the image into latents with the VAE
img_latents = pil_to_latent(src_img, vae)
print(f"Latent Dimensions: {img_latents.shape}")
```

    Source Image Size: (768, 512)
    Latent Dimensions: torch.Size([1, 4, 64, 96])
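
The shapes above are consistent with the VAE shrinking each spatial dimension by a constant factor. As a rough sketch, assuming an 8x spatial reduction and 4 latent channels (neither is stated in this README), the latent shape can be predicted from the PIL size. `expected_latent_shape` is a hypothetical helper for illustration, not part of `cjm_diffusers_utils`:

``` python
def expected_latent_shape(pil_size, scale_factor=8, latent_channels=4):
    """Predict the (N, C, H, W) latent shape for a PIL image size.

    Assumes an 8x spatial downsampling and 4 latent channels; both are
    assumptions about the VAE, not facts taken from this README.
    """
    width, height = pil_size  # PIL reports (width, height)
    # Tensors are laid out as (batch, channels, height, width)
    return (1, latent_channels, height // scale_factor, width // scale_factor)

print(expected_latent_shape((768, 512)))  # (1, 4, 64, 96)
```

Note the order swap: PIL reports `(width, height)`, while the latent tensor is laid out as `(batch, channels, height, width)`.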

### latent_to_pil

``` python
from cjm_diffusers_utils.core import latent_to_pil
```

``` python
decoded_img = latent_to_pil(img_latents, vae)
print(f"Decoded Image Size: {decoded_img.size}")
```

    Decoded Image Size: (768, 512)

### text_to_emb

``` python
from cjm_diffusers_utils.core import text_to_emb
from transformers import CLIPTextModel, CLIPTokenizer
```

``` python
# Load the tokenizer for the specified model
tokenizer = CLIPTokenizer.from_pretrained(model_name, subfolder="tokenizer")

# Load the text encoder for the specified model
text_encoder = CLIPTextModel.from_pretrained(model_name, subfolder="text_encoder").to(device=device, dtype=dtype)
```

``` python
prompt = "A cat sitting on the floor."
text_emb = text_to_emb(prompt, tokenizer, text_encoder)
text_emb.shape
```

    torch.Size([2, 77, 1024])
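
The leading dimension of 2 is consistent with the common practice of stacking an unconditional (negative-prompt) embedding with the prompt embedding for classifier-free guidance. Whether `text_to_emb` does exactly this is an assumption here, not something the README confirms. Under that assumption, a denoising loop would typically obtain two predictions and blend them as sketched below. `guided_prediction` and the default `guidance_scale` are illustrative, not part of the library:

``` python
def guided_prediction(uncond_pred, cond_pred, guidance_scale=7.5):
    """Blend unconditional and conditional predictions element-wise.

    Uses the standard classifier-free guidance formula:
        uncond + scale * (cond - uncond)
    Plain Python lists stand in for tensors to keep the sketch
    dependency-free.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond_pred, cond_pred)]

# With a scale of 1.0 the result is just the conditional prediction
print(guided_prediction([0.0, 1.0], [1.0, 3.0], guidance_scale=1.0))  # [1.0, 3.0]
```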

### prepare_noise_scheduler

``` python
from cjm_diffusers_utils.core import prepare_noise_scheduler
from diffusers import DEISMultistepScheduler
```

``` python
noise_scheduler = DEISMultistepScheduler.from_pretrained(model_name, subfolder='scheduler')
print(f"Number of timesteps: {len(noise_scheduler.timesteps)}")
print(noise_scheduler.timesteps[:10])

noise_scheduler = prepare_noise_scheduler(noise_scheduler, 70, 1.0)
print(f"Number of timesteps: {len(noise_scheduler.timesteps)}")
print(noise_scheduler.timesteps[:10])
```

    Number of timesteps: 1000
    tensor([999., 998., 997., 996., 995., 994., 993., 992., 991., 990.])
    Number of timesteps: 70
    tensor([999, 985, 970, 956, 942, 928, 913, 899, 885, 871])
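
The 70 timesteps printed above look evenly spaced over the original 1000 training steps. The sketch below is one spacing rule that reproduces the first ten values; it is an illustration, not the scheduler's actual implementation, and it does not model the third argument (`1.0`) passed to `prepare_noise_scheduler`. `spaced_timesteps` is a hypothetical helper:

``` python
def spaced_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Return descending, evenly spaced timesteps.

    Splits [0, num_train_timesteps - 1] into num_inference_steps equal
    intervals, rounds each boundary, and drops the final boundary (0).
    """
    last = num_train_timesteps - 1
    return [
        round(i * last / num_inference_steps)
        for i in range(num_inference_steps, 0, -1)
    ]

steps = spaced_timesteps(70)
print(len(steps))   # 70
print(steps[:10])   # [999, 985, 970, 956, 942, 928, 913, 899, 885, 871]
```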

### prepare_depth_mask

``` python
from cjm_diffusers_utils.core import prepare_depth_mask
```

``` python
depth_map_path = '../images/depth-cat.png'
depth_map = Image.open(depth_map_path)
print(f"Depth map size: {depth_map.size}")

depth_mask = prepare_depth_mask(depth_map).to(device=device, dtype=dtype)
depth_mask.shape, depth_mask.min(), depth_mask.max()
```

    Depth map size: (768, 512)
    (torch.Size([1, 1, 64, 96]),
     tensor(-1., device='cuda:0', dtype=torch.float16),
     tensor(1., device='cuda:0', dtype=torch.float16))
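
The output shows a mask at latent resolution (64 x 96 for a 768 x 512 depth map) with values spanning -1 to 1. A minimal sketch of the value rescaling, assuming a plain linear min-max normalization (the README does not say how `prepare_depth_mask` computes it), is shown below. `normalize_to_unit_range` is a hypothetical helper and the depth values are made up:

``` python
def normalize_to_unit_range(values):
    """Linearly rescale a 2D list of numbers so min -> -1.0 and max -> 1.0."""
    flat = [v for row in values for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        raise ValueError("cannot normalize a constant depth map")
    return [[2.0 * (v - lo) / (hi - lo) - 1.0 for v in row] for row in values]

depth = [[0, 128], [64, 255]]  # made-up 8-bit depth values
normalized = normalize_to_unit_range(depth)
print(normalized[0][0], normalized[1][1])  # -1.0 1.0
```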
