https://github.com/nimadez/mental-diffusion

Stable diffusion command-line interface

## Mental Diffusion

**Fast Stable Diffusion CLI**

Powered by [Diffusers](https://github.com/huggingface/diffusers)

Designed for Linux

| MDX | 0.9.0 |
| ------- | --- |
| Python | **3.12** - 3.11 |
| Torch | 2.3.1 +cu121 |
| Diffusers | 0.29.2 |
| + Gradio | 4.37.2 |

## Features
- SD, **SDXL**
- Load VAE and LoRA weights
- Txt2Img, Img2Img, Inpaint *(auto-pipeline)*
- TAESD latents preview *(image and animation)*
- Batch image generation, multiple images per prompt
- Read/write PNG metadata, auto-rename files
- CPU, GPU, Low VRAM mode *(auto mode)*
- Lightweight and fast, rewritten in **300** lines
- Proxy, offline mode, minimal downloads

> SD3 is currently not supported. [prototype](https://github.com/nimadez/mental-diffusion/blob/main/tests/sd3.py)

## Addons


All addons are optional and based on Gradio.

Addons are not as thoroughly tested as the `mdx.py` script.

| Name | Description | Screenshot |
| --- | --- | :---: |
| main | A tabbed interface for all addons | - |
| inference | The inference user-interface | [view](media/addon_inference.png) |
| preview | Watch preview and gallery | [view](media/addon_preview.png) |
| metadata | View and recreate data from PNG | [view](media/addon_metadata.png) |
| outpaint | Create image and mask for outpaint | [view](media/addon_outpaint.png) |
| upscale | Real-ESRGAN x2 and x4 plus | [view](media/addon_upscale.png) |

```
~/.venv/mdx/bin/python3 src/addons/addon-name.py
```

## Installation
> - Compatible with most diffusers-based python venvs
> - 3GB Python packages (5GB extracted)
> - 50MB Huggingface cache (automatic, mostly for taesd)
> - Make sure you have a swap partition or swap file
```
git clone https://github.com/nimadez/mental-diffusion
cd mental-diffusion

# Automatic installation for debian-based distributions:
apt install python3-pip python3-venv
sh install-venv.sh
sh install-bin.sh  # optional

# Manual installation:
python3 -m venv ~/.venv/mdx
source ~/.venv/mdx/bin/activate
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu121
pip install -r ./requirements.txt
deactivate
```
```optional``` Install Gradio for addons:
```
~/.venv/mdx/bin/python3 -m pip install gradio==4.37.2
```
```optional``` Install Real-ESRGAN for upscaler addon:
```
~/.venv/mdx/bin/python3 -m pip install realesrgan
```
```optional``` Install Zenity for inference addon:
```
apt install zenity
```
> Without Zenity, you can't select safetensors files with the file dialog; you have to enter the Checkpoint, VAE and LoRA paths manually.

## Arguments
```
~/.venv/mdx/bin/python3 mdx.py --help

--type       -t     str    sd, xl (def: custom)
--checkpoint -c     str    /checkpoint.safetensors (def: custom)
--scheduler  -sc    str    ddim, ddpm, euler, eulera, lcm, lms, pndm (def: custom)
--prompt     -p     str    positive prompt
--negative   -n     str    negative prompt
--width      -w     int    divisible by 8 (def: custom)
--height     -h     int    divisible by 8 (def: custom)
--seed       -s     int    -1 randomize (def: -1)
--steps      -st    int    1 to 100+ (def: 24)
--guidance   -g     float  0 - 20.0+ (def: 8.0)
--strength   -sr    float  0 - 1.0 (def: 1.0)
--lorascale  -ls    float  0 - 1.0 (def: 1.0)
--image      -i     str    /image.png
--mask       -m     str    /mask.png
--vae        -v     str    /vae.safetensors
--lora       -l     str    /lora.safetensors
--filename   -f     str    filename prefix without .png extension, add {seed} to be replaced (def: img_{seed})
--output     -o     str    image and preview output directory (def: custom)
--number     -no    int    number of images to generate per prompt (def: 1)
--batch      -b     int    number of repeats to run in batch, --seed -1 to randomize
--preview    -pv           stepping is slower with preview enabled (def: no preview)
--lowvram    -lv           slower if you have enough VRAM, automatic on 4GB cards (def: no lowvram)
--metadata   -meta  str    /image.png, extract metadata from png

[automatic pipeline]
Txt2Img: no --image and no --mask
Img2Img: --image and no --mask
Inpaint: --image and --mask
ERROR: --mask without --image
```
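The automatic pipeline rules above can be sketched in a few lines of Python. This is an illustration only, not the actual logic in `mdx.py`; the function name `select_pipeline` and the returned labels are hypothetical.

```python
from typing import Optional


def select_pipeline(image: Optional[str], mask: Optional[str]) -> str:
    """Pick a pipeline label from the presence of --image and --mask.

    Hypothetical helper that mirrors the documented rules; mdx.py may
    implement the selection differently.
    """
    if image is None and mask is None:
        return "txt2img"
    if image is not None and mask is None:
        return "img2img"
    if image is not None and mask is not None:
        return "inpaint"
    # A mask without an image has nothing to inpaint.
    raise ValueError("--mask requires --image")


print(select_pipeline(None, None))                 # txt2img
print(select_pipeline("/image.png", None))         # img2img
print(select_pipeline("/image.png", "/mask.png"))  # inpaint
```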
```
Default: mdx -p "prompt" -st 28 -g 7.5
SD: mdx -t sd -c /checkpoint.safetensors -w 512 -h 512
SDXL: mdx -t xl -c /checkpoint.safetensors -w 768 -h 768
Img2Img: mdx -i /image.png -sr 0.5
Inpaint: mdx -i /image.png -m ./mask.png
VAE: mdx -v /vae.safetensors
LoRA: mdx -l /lora.safetensors -ls 0.5
Filename: mdx -f img_test_{seed}
Output: mdx -o /home/user/.mdx
Number: mdx -no 4
Batch: mdx -b 10
Preview: mdx -pv
Low VRAM: mdx -lv
Metadata: mdx -meta ./image.png
```
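When calling MDX from another Python program, it can help to assemble the argument list in code rather than by string concatenation. The sketch below uses only flags documented above and the interpreter path from the `--help` example; the helper name `build_mdx_command` and its parameter names are assumptions made for illustration.

```python
import os
import shlex
from typing import List, Optional


def build_mdx_command(
    prompt: str,
    steps: int = 24,          # documented default for --steps
    guidance: float = 8.0,    # documented default for --guidance
    seed: int = -1,           # -1 randomizes the seed
    image: Optional[str] = None,
    strength: Optional[float] = None,
) -> List[str]:
    """Return an argv list for mdx.py (hypothetical helper)."""
    python = os.path.expanduser("~/.venv/mdx/bin/python3")
    cmd = [python, "mdx.py", "-p", prompt,
           "-st", str(steps), "-g", str(guidance), "-s", str(seed)]
    if image is not None:
        # Passing --image without --mask selects Img2Img.
        cmd += ["-i", image]
        if strength is not None:
            cmd += ["-sr", str(strength)]
    return cmd


cmd = build_mdx_command("a lighthouse at dusk", steps=28, guidance=7.5)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

To run the command, pass the list to `subprocess.run(cmd, check=True)` from the repository root, so that `mdx.py` resolves.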

## Direct Inference
Import the MDX class to run inference from JSON data

```
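import argparse
import json

from mdx import MDX

# `data` starts as a JSON string; its keys are assumed to match the
# argument names listed above (e.g. "prompt").
data = json.loads(data)
data["prompt"] = "new prompt"

# Build the args namespace from the JSON data; passing [] keeps the
# empty parser from reading the calling script's own sys.argv.
parser = argparse.ArgumentParser()
args = parser.parse_args([], namespace=argparse.Namespace(**data))

MDX().main(args)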
```
> Inference can be interrupted by creating a file named ".interrupt" in the --output directory
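
A minimal sketch of requesting an interrupt from another process, assuming you know which directory was passed as `--output`. The helper name `request_interrupt` is hypothetical, and the demo uses a temporary directory in place of a real output directory.

```python
import tempfile
from pathlib import Path


def request_interrupt(output_dir: str) -> Path:
    """Create the ".interrupt" marker file in the --output directory."""
    marker = Path(output_dir).expanduser() / ".interrupt"
    marker.touch()
    return marker


# Demo against a temporary directory standing in for the output directory.
with tempfile.TemporaryDirectory() as tmp:
    marker = request_interrupt(tmp)
    print(marker.name, marker.exists())
```

Whether MDX deletes the marker afterwards is not stated here, so check for a leftover `.interrupt` file before starting the next run.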

## Tips & Tricks
* Enable OFFLINE if you have already downloaded the huggingface cache
* Enable SAVE_ANIM to save the preview animation to {output}/filename.webp
```
Preview, cancel, and repeat faster:
mdx -p "prompt" -g 8.0 -st 30 -pv
mdx -p "prompt" -g 8.0 -st 30 -s 827362763262387

Improve details with Img2Img pipeline:
mdx -p "prompt" -st 20 -f myimage
mdx -p "prompt" -st 30 -i ~/.mdx/image.png -sr 0.15

Content-aware upscaling: (ImageMagick, similar to A1111 hires-fix)
mdx -p "prompt" -st 20 -w 512 -h 512 -f image
magick convert ~/.mdx/image.png -resize 200% ~/.mdx/image_up.png
mdx -p "prompt" -st 20 -i ~/.mdx/image_up.png -sr 0.5

Generate 40 images in less time:
mdx -p "prompt" -b 10 -no 4

Extract images from WebP animation: (ImageMagick)
magick convert image.webp image.jpg

Create images across the LAN via SSH:
apt install openssh-server && ssh-keygen -t rsa -b 4096
ssh user@192.168.x.x
$ mdx -p "prompt"

Explore output directory in a browser across the LAN:
cd ~/.mdx && python3 -m http.server 8000
$ open http://192.168.x.x:8000

Download huggingface cache in a specific path:
mkdir ~/.hfcache && ln -s ~/.hfcache ~/.cache/huggingface
```
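
The content-aware upscaling tip resizes by 200%, which keeps a 512x512 image divisible by 8. For other factors, a small helper can snap the target size to a multiple of 8 before resizing. This is a sketch under the assumption that Img2Img input images should follow the same divisible-by-8 rule as `--width` and `--height`; the function name `scaled_size` is hypothetical.

```python
from typing import Tuple


def scaled_size(width: int, height: int,
                factor: float = 2.0, multiple: int = 8) -> Tuple[int, int]:
    """Scale a size and round each side down to a multiple of `multiple`."""
    def snap(value: float) -> int:
        # Round down, but never below one full multiple.
        return max(multiple, int(value) // multiple * multiple)

    return snap(width * factor), snap(height * factor)


print(scaled_size(512, 512))       # (1024, 1024)
print(scaled_size(500, 500, 1.5))  # (744, 744)
```

The result can be passed to ImageMagick as an explicit geometry such as `-resize 744x744!` in place of a percentage; the `!` forces the exact size.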

## Tests
| v0.9.0 | SD CPU | SD GPU | SDXL GPU |
| --- | :---: | :---: | :---: |
| Txt2Img | ✓ | ✓ | ✓ |
| Img2Img | ✓ | ✓ | ✓ |
| Inpaint | ✓ | ✓ | ✓ |
| VAE | ✓ | ✓ | ✓ |
| LoRA | ✓ | ✓ | ✓ |
| Batch | ✓ | ✓ | ✓ |
| Preview | ✓ | ✓ | ✓ |
| Low VRAM | | ✓ | ✓ |

- Debian Trixie (testing branch)
- Kernel 6.9.8
- Nvidia driver 535

## Previous Experiments

> - [Legacy command-line interface and server](https://github.com/nimadez/mental-diffusion/tree/main/legacy/README.md) (diffusers)
> - [ComfyUI bridge for VS Code extension](https://github.com/nimadez/mental-diffusion/tree/main/comfyui/README.md)

## History
```
↑ Gradio webui addons
↑ Rewritten in 300 lines
↑ Port to Linux
↑ Back to Diffusers
↑ Port to Code (webui)
↑ Change to ComfyUI API (webui)
↑ Created for personal use on Windows OS (diffusers)

"AI will bring us back to the age of terminals."
```

## License
Code released under the [MIT license](https://github.com/nimadez/mental-diffusion/blob/main/LICENSE).

## Credits
- [Hugging Face](https://huggingface.co/)
- [Diffusers](https://github.com/huggingface/diffusers)
- [PyTorch](https://pytorch.org/)
- [Stability-AI](https://github.com/Stability-AI)
- [TAESD](https://github.com/madebyollin/taesd)
- [Gradio](https://www.gradio.app/)
- [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN)

##### Models
- zavychromaxl_v80
- OpenDalleV1.1
- juggernaut_aftermath