https://github.com/sauravrwt/text-to-img

A Python-based text-to-image generation tool leveraging stable diffusion models. This application provides enhanced control over image generation through optimized parameters, improved prompting, and an efficient interface.

Topics: diffuser, fastai, gradio, python, stable-diffusion, torch, transformers


# Text to Image Generator

A powerful Python-based text-to-image generation tool using Stable Diffusion models with optimized performance and an intuitive Gradio interface. Generate high-quality images from text descriptions with advanced customization options.

![Text to Image Generator](./Images/txt-to-img-5.png)

## Features

### Multiple Model Support
- Stable Diffusion 1.5
- Stable Diffusion 2.1
- Dreamshaper 8

### Smart Optimizations
- Automatic VRAM-based optimization
- xFormers memory efficient attention (when available)
- VAE slicing for better memory usage
- Sequential CPU offload for low VRAM systems
- Dynamic model caching

### Advanced Generation Controls
- Adaptive guidance scaling (7-30 range)
- Quality enhancement prompts
- Comprehensive negative prompts
- Custom seed support
- Adjustable image dimensions (512-1024px)
- Configurable inference steps (20-100)
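
The ranges listed above could be enforced before a request reaches the pipeline. The following is a minimal sketch, not the project's actual code: `GenerationSettings`, `clamp`, and `normalize_settings` are hypothetical names, and the default values are assumptions. Snapping dimensions to a multiple of 8 reflects the requirement of Stable Diffusion pipelines that width and height be divisible by 8.

```python
import random
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    guidance_scale: float
    width: int
    height: int
    steps: int
    seed: int


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def normalize_settings(guidance_scale=7.5, width=512, height=512, steps=30, seed=None):
    """Clamp user input to the documented ranges (hypothetical helper)."""

    def snap(px):
        # Keep the side within 512-1024 px and round down to a multiple of 8.
        return clamp(int(px), 512, 1024) // 8 * 8

    if seed is None:
        # No custom seed supplied: draw a random one so the run can be reproduced later.
        seed = random.randint(0, 2**32 - 1)

    return GenerationSettings(
        guidance_scale=float(clamp(guidance_scale, 7, 30)),
        width=snap(width),
        height=snap(height),
        steps=int(clamp(steps, 20, 100)),
        seed=seed,
    )
```

Reusing the returned `seed` with the same prompt and settings is what makes a result repeatable, which is the usual purpose of custom seed support.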

### User Interface
- Clean, intuitive Gradio interface
- Example prompts gallery
- Real-time generation details
- Advanced settings accordion
- Random prompt generation
- Clear function

### Results

Below are sample outputs generated by the Text to Image Generator using various prompts and models:

| Prompt | Model | Output |
|--------|-------|--------|
| *A serene landscape with mountains, a lake, and a sunset, highly detailed, vibrant colors* | Stable Diffusion 1.5 | ![Sample 3](./Images/sd-v1.5.webp) |
| *A serene landscape with mountains, a lake, and a sunset, highly detailed, vibrant colors* | Stable Diffusion 2.1 | ![Sample 1](./Images/sd-v2.1.webp) |
| *A serene landscape with mountains, a lake, and a sunset, highly detailed, vibrant colors* | Dreamshaper 8 | ![Sample 2](./Images/dreamshaper-v8.webp) |

> *Note: Output quality and style may vary depending on the selected model and prompt.*

## Requirements

- Python 3.13
- CUDA-compatible GPU (recommended)
- CPU-only mode supported

## Setup Guide

### Google Colab Setup
1. Open `Text-to-img.ipynb` in Google Colab
2. Select 'Runtime' > 'Change runtime type' > Choose 'T4 GPU'
3. Run all cells in sequence

### Hugging Face Token Setup

#### Google Colab
1. Obtain your Hugging Face token from [Hugging Face](https://huggingface.co/settings/tokens).
2. Add the token to the Colab notebook by running the following code in a cell:
```python
from huggingface_hub import login
login("your_huggingface_token_here")
```

#### Local Deployment
1. Obtain your Hugging Face token from [Hugging Face](https://huggingface.co/settings/tokens).
2. Create a file named `token` inside the `huggingface` directory:
```bash
echo "your_huggingface_token_here" > huggingface/token
```
In Google Colab, you can instead store the token as a secret rather than pasting it into a cell. Open the Secrets panel (key icon in the left sidebar) and add a secret with the following name, using your token as its value:

```text
HF_TOKEN=your_huggingface_token_here
```
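
The token sources described in this section (a Colab secret named `HF_TOKEN` and the local `huggingface/token` file) could be checked from one place. This is a sketch under assumptions rather than the application's own logic: `resolve_hf_token` is a hypothetical helper, the lookup order is a choice made here, and the `HF_TOKEN` environment variable is added as a common convention for local runs.

```python
import os
from pathlib import Path


def resolve_hf_token(token_file="huggingface/token"):
    """Return the first token found, or None (hypothetical helper)."""
    # 1. Environment variable, convenient for local runs and CI.
    token = os.environ.get("HF_TOKEN")
    if token and token.strip():
        return token.strip()

    # 2. Colab secret named HF_TOKEN; google.colab only exists inside Colab.
    try:
        from google.colab import userdata

        token = userdata.get("HF_TOKEN")
        if token and token.strip():
            return token.strip()
    except Exception:
        pass  # not in Colab, or the secret is missing or not shared with the notebook

    # 3. The local token file created during local deployment.
    path = Path(token_file)
    if path.is_file():
        return path.read_text(encoding="utf-8").strip() or None
    return None
```

The result can be passed to `huggingface_hub.login(token=...)`. Keeping the token out of notebook cells and out of version control is the main reason to prefer any of these sources over a hard-coded string.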

### Local Installation
Note: Python 3.13 is required. If setup fails with a wheel build error, run `pip install --upgrade pip setuptools wheel`, restart the kernel, and try again.

#### 1. Clone Repository
```bash
git clone https://github.com/SauRavRwT/text-to-img.git
cd text-to-img
```

#### 2. Create Virtual Environment
Windows:
```bash
python -m venv venv
venv\Scripts\activate
```

Linux/Mac:
```bash
python -m venv venv
source venv/bin/activate
```

#### 3. Install Dependencies
```bash
pip install -r requirements.txt
```

#### 4. Launch Application
```bash
python app.py
```

### Technical Notes
- The system automatically detects available VRAM and applies appropriate optimizations
- For systems with less than 6GB VRAM: Uses sequential CPU offload
- For systems with 6-8GB VRAM: Uses model CPU offload
- For systems with >8GB VRAM: Runs fully on GPU with memory optimizations
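
The three VRAM tiers above can be sketched as a small selection function. This is an illustration under assumptions, not the repository's implementation: `pick_strategy` and `apply_optimizations` are hypothetical names, and enabling VAE slicing in every tier and xFormers only in the full-GPU tier is a guess. The `enable_*` methods are those exposed by Stable Diffusion pipelines in `diffusers`; the offload methods additionally need the `accelerate` package.

```python
def pick_strategy(vram_gb):
    """Map available VRAM in GB to a tier; None means no CUDA device."""
    if vram_gb is None:
        return "cpu"
    if vram_gb < 6:
        return "sequential_cpu_offload"
    if vram_gb <= 8:
        return "model_cpu_offload"
    return "full_gpu"


def apply_optimizations(pipe, vram_gb):
    """Apply the chosen tier to a diffusers pipeline and return its name."""
    strategy = pick_strategy(vram_gb)

    # Assumption: VAE slicing is cheap enough to enable everywhere.
    pipe.enable_vae_slicing()

    if strategy == "sequential_cpu_offload":
        # Moves submodules to the GPU one at a time: lowest VRAM use, slowest.
        pipe.enable_sequential_cpu_offload()
    elif strategy == "model_cpu_offload":
        # Moves whole models (text encoder, UNet, VAE) on and off the GPU.
        pipe.enable_model_cpu_offload()
    elif strategy == "full_gpu":
        pipe.to("cuda")
        try:
            pipe.enable_xformers_memory_efficient_attention()
        except Exception:
            pass  # xFormers is optional; fall back to default attention
    return strategy
```

With PyTorch installed, the VRAM figure can be obtained from `torch.cuda.get_device_properties(0).total_memory / 1024**3` when `torch.cuda.is_available()` is true, and passed as `None` otherwise.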

### Troubleshooting CUDA Out-of-Memory Errors

- Reduce image dimensions
- Decrease inference steps
- Use a lower-resolution model
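
The first two suggestions can be automated as a retry ladder. The sketch below is one possible approach, not something the application is known to do: `oom_fallbacks` is a hypothetical generator, and the step sizes (128 px, 10 steps) and the floors (512 px, 20 steps, taken from the documented ranges) are assumptions.

```python
def oom_fallbacks(width, height, steps, min_side=512, min_steps=20):
    """Yield progressively lighter (width, height, steps) settings."""
    while True:
        if max(width, height) > min_side:
            # Reduce image dimensions first, 128 px per side at a time.
            width = max(min_side, width - 128)
            height = max(min_side, height - 128)
        elif steps > min_steps:
            # Then decrease inference steps, 10 at a time.
            steps = max(min_steps, steps - 10)
        else:
            return  # nothing left to reduce
        yield width, height, steps
```

A caller would catch the out-of-memory error raised by a generation attempt (in recent PyTorch versions, `torch.cuda.OutOfMemoryError`), take the next tuple from the generator, and retry. Once the generator is exhausted, switching to a lower-resolution model is the remaining option.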