https://github.com/sauravrwt/text-to-img
A Python-based text-to-image generation tool leveraging stable diffusion models. This application provides enhanced control over image generation through optimized parameters, improved prompting, and an efficient interface.
diffuser fastai gradio python stable-diffusion torch transformers
- Host: GitHub
- URL: https://github.com/sauravrwt/text-to-img
- Owner: SauRavRwT
- Created: 2025-02-10T12:23:06.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-03-22T04:03:50.000Z (about 1 year ago)
- Last Synced: 2025-03-22T05:18:22.833Z (about 1 year ago)
- Topics: diffuser, fastai, gradio, python, stable-diffusion, torch, transformers
- Language: Jupyter Notebook
- Homepage:
- Size: 1.43 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Text to Image Generator
A powerful Python-based text-to-image generation tool using Stable Diffusion models with optimized performance and an intuitive Gradio interface. Generate high-quality images from text descriptions with advanced customization options.

## Features
### Multiple Model Support
- Stable Diffusion 1.5
- Stable Diffusion 2.1
- Dreamshaper 8
### Smart Optimizations
- Automatic VRAM-based optimization
- xFormers memory efficient attention (when available)
- VAE slicing for better memory usage
- Sequential CPU offload for low VRAM systems
- Dynamic model caching
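The README does not describe how model caching works. One plausible reading is that a loaded pipeline is reused while the same model stays selected and is released when the user switches models. The following is a minimal sketch of that idea, not the project's actual code. The class name `SinglePipelineCache` and the `loader` callable are hypothetical; in practice the loader would wrap whatever call builds the pipeline.

```python
from typing import Any, Callable, Optional


class SinglePipelineCache:
    """Keep at most one loaded pipeline in memory, keyed by model name."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        # Hypothetical loader: any callable that builds a pipeline from a name.
        self._loader = loader
        self._name: Optional[str] = None
        self._pipeline: Any = None

    def get(self, model_name: str) -> Any:
        # Reuse the loaded pipeline when the same model is requested again.
        if model_name == self._name:
            return self._pipeline
        # Drop the old reference first so its memory can be reclaimed
        # before the next model is loaded.
        self._pipeline = None
        self._name = None
        self._pipeline = self._loader(model_name)
        self._name = model_name
        return self._pipeline
```

Holding only one pipeline trades reload time for lower peak memory, which suits the low-VRAM systems the README targets.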
### Advanced Generation Controls
- Adaptive guidance scaling (7-30 range)
- Quality enhancement prompts
- Comprehensive negative prompts
- Custom seed support
- Adjustable image dimensions (512-1024px)
- Configurable inference steps (20-100)
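The ranges listed above can be enforced before a request reaches the pipeline. The sketch below is an illustration only: the `GenerationSettings` class, its default values, and the `-1` seed convention are assumptions, not taken from the project. It clamps each value to the documented range and rounds dimensions down to a multiple of 8, which Stable Diffusion pipelines in `diffusers` require.

```python
from dataclasses import dataclass


def _clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class GenerationSettings:
    # Defaults are assumptions chosen for illustration.
    width: int = 512
    height: int = 512
    steps: int = 30
    guidance_scale: float = 7.5
    seed: int = -1  # hypothetical convention: -1 means "pick a random seed"

    def normalized(self) -> "GenerationSettings":
        def dim(value: int) -> int:
            # Clamp to 512-1024, then round down to a multiple of 8.
            return _clamp(int(value), 512, 1024) // 8 * 8

        return GenerationSettings(
            width=dim(self.width),
            height=dim(self.height),
            steps=_clamp(int(self.steps), 20, 100),
            guidance_scale=float(_clamp(self.guidance_scale, 7.0, 30.0)),
            seed=self.seed,
        )
```

Calling `GenerationSettings(width=2000, steps=5).normalized()` returns a copy with the width capped at 1024 and the steps raised to 20, leaving the original object unchanged.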
### User Interface
- Clean, intuitive Gradio interface
- Example prompts gallery
- Real-time generation details
- Advanced settings accordion
- Random prompt generation
- Clear function
### Results
Below are sample outputs generated by the Text to Image Generator using various prompts and models:
| Prompt | Model | Output |
|--------|-------|--------|
| *A serene landscape with mountains, a lake, and a sunset, highly detailed, vibrant colors* | Stable Diffusion 1.5 |  |
| *A serene landscape with mountains, a lake, and a sunset, highly detailed, vibrant colors* | Stable Diffusion 2.1 |  |
| *A serene landscape with mountains, a lake, and a sunset, highly detailed, vibrant colors* | Dreamshaper 8 |  |
> *Note: Output quality and style may vary depending on the selected model and prompt.*
## Requirements
- Python 3.13
- CUDA-compatible GPU (recommended)
- CPU-only mode supported
## Setup Guide
### Google Colab Setup
1. Open `Text-to-img.ipynb` in Google Colab
2. Select 'Runtime' > 'Change runtime type' > Choose 'T4 GPU'
3. Run all cells in sequence
### Hugging Face Token Setup
#### Google Colab
1. Obtain your Hugging Face token from [Hugging Face](https://huggingface.co/settings/tokens).
2. Add the token to the Colab notebook by running the following code in a cell:
```python
from huggingface_hub import login
login("your_huggingface_token_here")
```
#### Local Deployment
1. Obtain your Hugging Face token from [Hugging Face](https://huggingface.co/settings/tokens).
2. Create a file named `token` inside the `huggingface` directory:
```bash
echo "your_huggingface_token_here" > huggingface/token
```
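Alternatively, in Google Colab you can avoid pasting the token into a cell by storing it as a secret: open the Secrets panel (key icon in the left sidebar), add a secret named `HF_TOKEN`, and enable notebook access. Then read it in a cell:
```python
from google.colab import userdata
HF_TOKEN = userdata.get("HF_TOKEN")  # pass this to login() instead of a literal token
```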
### Local Installation
Note: Python 3.13 is required. If setup fails with a wheel build error, run `pip install --upgrade pip setuptools wheel` and restart the kernel.
#### 1. Clone Repository
```bash
git clone https://github.com/SauRavRwT/text-to-img.git
cd text-to-img
```
#### 2. Create Virtual Environment
Windows:
```bash
python -m venv venv
venv\Scripts\activate
```
Linux/Mac:
```bash
python -m venv venv
source venv/bin/activate
```
#### 3. Install Dependencies
```bash
pip install -r requirements.txt
```
#### 4. Launch Application
```bash
python app.py
```
### Technical Notes
- The system automatically detects available VRAM and applies the matching optimizations:
  - Less than 6 GB VRAM: sequential CPU offload
  - 6-8 GB VRAM: model CPU offload
  - More than 8 GB VRAM: runs fully on the GPU with memory optimizations
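The VRAM tiers above map naturally onto the offload helpers that `diffusers` pipelines expose. The sketch below shows one way the selection could look. It is an assumption about the structure, not the project's code: the function names and strategy strings are hypothetical, and applying VAE slicing only in the full-GPU tier is a guess at what "memory optimizations" covers.

```python
from typing import Optional


def detect_vram_gb() -> Optional[float]:
    """Return total VRAM of GPU 0 in GiB, or None when CUDA is unavailable."""
    import torch  # imported lazily so the helpers below work without torch

    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


def choose_strategy(vram_gb: float) -> str:
    """Map total VRAM to one of the three tiers described in the notes."""
    if vram_gb < 6:
        return "sequential_cpu_offload"
    if vram_gb <= 8:
        return "model_cpu_offload"
    return "full_gpu"


def apply_strategy(pipe, strategy: str) -> None:
    """Apply the chosen tier to a diffusers-style pipeline object."""
    if strategy == "sequential_cpu_offload":
        # Lowest memory use, slowest: submodules move to the GPU one at a time.
        pipe.enable_sequential_cpu_offload()
    elif strategy == "model_cpu_offload":
        # Whole components (text encoder, UNet, VAE) are swapped in as needed.
        pipe.enable_model_cpu_offload()
    else:
        pipe.to("cuda")
        # Assumption: VAE slicing stands in for the "memory optimizations".
        pipe.enable_vae_slicing()
```

A caller would run `vram = detect_vram_gb()` and, when the result is not `None`, call `apply_strategy(pipe, choose_strategy(vram))`. Note that the offload helpers manage device placement themselves, so the pipeline is moved to `"cuda"` only in the full-GPU tier.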
### For CUDA out of memory errors:
- Reduce image dimensions
- Decrease inference steps
- Use a lower resolution model
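The first two suggestions can also be automated as a retry schedule. The following sketch is hypothetical and not part of the project: the function name `fallback_settings`, the 128-pixel shrink step, and the 10-step decrement are illustrative choices. It yields the requested settings first, then progressively cheaper ones, shrinking the image before reducing the step count.

```python
from typing import Iterator, Tuple


def fallback_settings(
    width: int,
    height: int,
    steps: int,
    min_size: int = 512,
    min_steps: int = 20,
) -> Iterator[Tuple[int, int, int]]:
    """Yield progressively cheaper (width, height, steps) settings."""
    yield width, height, steps
    while width > min_size or height > min_size or steps > min_steps:
        if width > min_size or height > min_size:
            # Shrink each side by 128 px, keep it a multiple of 8,
            # and never go below the minimum size.
            width = max(min_size, (width - 128) // 8 * 8)
            height = max(min_size, (height - 128) // 8 * 8)
        else:
            # Dimensions are already minimal, so reduce the step count.
            steps = max(min_steps, steps - 10)
        yield width, height, steps
```

Each candidate can be tried in turn, catching `torch.cuda.OutOfMemoryError` and calling `torch.cuda.empty_cache()` before the next attempt.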