๐ŸŽ™๏ธ Whisper Transcriber Bot


[![Telegram](https://img.shields.io/badge/Telegram-Demo-01CC1D?logo=telegram&style=flat)](https://t.me/TranscriberXBOT)
[![Docker](https://img.shields.io/badge/Docker-Ready-2D80E3?logo=docker&style=flat)](https://hub.docker.com/r/malithrukshan/whisper-transcriber-bot)
![License](https://img.shields.io/badge/License-MIT-green.svg)

✨ Transform voice into text instantly with AI-powered transcription magic! ✨


- A self-hosted, privacy-focused transcription bot for Telegram -


🚀 No GPU Required • No API Keys • CPU-Only • Low Resource Usage ツ

## ✨ Features

- 🎙️ **Voice Transcription** - Convert voice messages to text instantly
- 🎵 **Multi-Format Support** - MP3, M4A, WAV, OGG, FLAC audio files
- ⚡ **Concurrent Processing** - Handle multiple users simultaneously
- 📝 **Smart Text Handling** - Auto-generate text files for long transcriptions (see the sketch after this list)
- 🧠 **AI-Powered** - OpenAI Whisper model for accurate transcription
- 💻 **CPU-Only Processing** - No GPU required, runs on basic servers (512MB RAM minimum)
- 🚫 **No API Dependencies** - No external API keys or cloud services needed
- 🐳 **Docker Ready** - Easy deployment with containerization
- 🔒 **Privacy Focused** - Process audio locally, complete data privacy
- 💰 **Cost Effective** - Ultra-low resource usage, perfect for budget hosting
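
The smart text handling comes down to a length check against Telegram's 4096-character message limit: short results are sent as plain text, long ones as a `.txt` attachment. A minimal sketch of that idea (the helper name and exact flow are illustrative, not the project's actual code):

```python
import io

from telegram import Message

TELEGRAM_MESSAGE_LIMIT = 4096  # Telegram caps a single text message at 4096 characters


async def send_transcription(message: Message, text: str) -> None:
    """Reply with plain text, or attach a .txt file when the result is too long."""
    if len(text) <= TELEGRAM_MESSAGE_LIMIT:
        await message.reply_text(text)
    else:
        # Wrap long transcriptions in an in-memory file and send them as a document.
        buffer = io.BytesIO(text.encode("utf-8"))
        await message.reply_document(document=buffer, filename="transcription.txt")
```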

## 🎬 Demo

Whisper Transcriber Bot Demo

## 🚀 One-Click Deploy

[![Deploy to Heroku](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy)

[![Deploy to DO](https://www.deploytodo.com/do-btn-blue.svg)](https://cloud.digitalocean.com/apps/new?repo=https://github.com/Malith-Rukshan/whisper-transcriber-bot/tree/main)

*Deploy instantly to your favorite cloud platform with pre-configured settings! All platforms support CPU-only deployment - no GPU needed!*

## 📝 Quick Start

### Prerequisites

- Docker and Docker Compose
- Telegram Bot Token ([Create Bot](https://t.me/botfather))

### Installation

1. **Clone the repository**
```bash
git clone https://github.com/Malith-Rukshan/whisper-transcriber-bot.git
cd whisper-transcriber-bot
```

2. **Configure environment**
```bash
cp .env.example .env
nano .env # Add your bot token
```

3. **Download AI model**
```bash
chmod +x download_model.sh
./download_model.sh
```

4. **Deploy with Docker**
```bash
docker-compose up -d
```

🎉 **That's it!** Your bot is now running and ready to transcribe audio!

## 📋 Usage

### Bot Commands

| Command | Description |
|---------|-------------|
| `/start` | 🏠 Welcome message and bot introduction |
| `/help` | 📖 Detailed usage instructions |
| `/about` | ℹ️ Bot information and developer details |
| `/status` | 🔍 Check bot health and configuration |
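
Under the hood these commands are ordinary python-telegram-bot command handlers. A minimal, self-contained sketch of how such a command set can be wired up (the handler body here is a placeholder, not the project's actual implementation in `src/bot.py`):

```python
import os

from telegram import Update
from telegram.ext import Application, CommandHandler, ContextTypes


async def start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    # Placeholder welcome text; the real bot sends a richer introduction.
    await update.message.reply_text("Hi! Send me a voice message and I'll transcribe it.")


def main() -> None:
    app = Application.builder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
    app.add_handler(CommandHandler("start", start))
    # /help, /about and /status are registered the same way with their own handlers.
    app.run_polling()


if __name__ == "__main__":
    main()
```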

### How to Use

1. **Send Audio** 🎙️ - Forward voice messages or upload audio files
2. **Wait for AI** ⏳ - Bot processes audio (typically 1-3 seconds)
3. **Get Text** 📝 - Receive transcription or download text file for long content

### Supported Formats

- **Voice Messages** - Direct Telegram voice notes
- **Audio Files** - MP3, M4A, WAV, OGG, FLAC (up to 50MB; size check sketched below)
- **Document Audio** - Audio files sent as documents
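
All three input types can be caught by a single message handler that enforces the size limit before downloading anything. A hedged sketch using python-telegram-bot's filters (the actual handler in `src/bot.py` may differ):

```python
from telegram import Update
from telegram.ext import ContextTypes, MessageHandler, filters

MAX_AUDIO_SIZE_MB = 50  # mirrors the MAX_AUDIO_SIZE_MB environment variable


async def handle_audio(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    message = update.message
    # Voice notes, audio files and audio sent as documents all expose the same File API.
    media = message.voice or message.audio or message.document
    if media.file_size and media.file_size > MAX_AUDIO_SIZE_MB * 1024 * 1024:
        await message.reply_text("Sorry, that file is larger than the 50MB limit.")
        return
    tg_file = await media.get_file()
    local_path = await tg_file.download_to_drive()  # temp file handed off to the transcriber
    await message.reply_text(f"Got it, transcribing {local_path.name}...")


audio_handler = MessageHandler(
    filters.VOICE | filters.AUDIO | filters.Document.AUDIO, handle_audio
)
```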

## ๐Ÿณ Docker Deployment

### Cloud Deployment (Recommended)

Perfect for cloud platforms like Render, Railway, etc. The model is included in the image.

```yaml
version: '3.8'
services:
  whisper-bot:
    image: malithrukshan/whisper-transcriber-bot:latest
    container_name: whisper-transcriber-bot
    restart: unless-stopped
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
```

### Local Development with Volume Mount

For local development where you want to persist models between container rebuilds:

```yaml
version: '3.8'
services:
  whisper-bot:
    image: malithrukshan/whisper-transcriber-bot:latest
    container_name: whisper-transcriber-bot
    restart: unless-stopped
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
    volumes:
      - ./models:/app/models
```

### Using Docker CLI

```bash
# Cloud deployment (model included in image)
docker run -d \
  --name whisper-bot \
  -e TELEGRAM_BOT_TOKEN=your_token_here \
  malithrukshan/whisper-transcriber-bot:latest

# Local development (with volume mount)
docker run -d \
  --name whisper-bot \
  -e TELEGRAM_BOT_TOKEN=your_token_here \
  -v $(pwd)/models:/app/models \
  malithrukshan/whisper-transcriber-bot:latest
```

## 🛠️ Development

### Local Development Setup

```bash
# Clone and setup
git clone https://github.com/Malith-Rukshan/whisper-transcriber-bot.git
cd whisper-transcriber-bot

# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt # Development dependencies

# Download model
./download_model.sh

# Configure environment
cp .env.example .env
# Add your bot token to .env

# Run bot
python src/bot.py
```

### Project Structure

```
whisper-transcriber-bot/
├── src/                    # Source code
│   ├── bot.py              # Main bot application
│   ├── transcriber.py      # Whisper integration
│   ├── config.py           # Configuration management
│   └── utils.py            # Utility functions
├── tests/                  # Test files
│   ├── test_bot.py         # Bot functionality tests
│   └── test_utils.py       # Utility function tests
├── .github/workflows/      # CI/CD automation
├── models/                 # AI model storage
├── Dockerfile              # Container configuration
├── docker-compose.yml      # Deployment setup
├── requirements.txt        # Production dependencies
├── requirements-dev.txt    # Development dependencies
└── README.md               # This file
```

## 🧪 Testing

### Running Tests

```bash
# Run all tests
python -m pytest

# Run with coverage
python -m pytest --cov=src

# Run specific test file
python -m pytest tests/test_bot.py

# Run with verbose output
python -m pytest -v
```
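
The async handlers are easiest to test by calling them directly with mocked `Update`/`Context` objects. A self-contained illustration of that pattern (not one of the project's actual tests; it assumes the `pytest-asyncio` plugin is among the dev dependencies):

```python
from unittest.mock import AsyncMock, MagicMock

import pytest


async def start(update, context):
    # Stand-in for a real command handler such as the one behind /start.
    await update.message.reply_text("Welcome!")


@pytest.mark.asyncio
async def test_start_sends_welcome_message():
    update = MagicMock()
    update.message.reply_text = AsyncMock()

    await start(update, context=MagicMock())

    update.message.reply_text.assert_awaited_once_with("Welcome!")
```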

### Test Coverage

The test suite covers:
- ✅ Bot initialization and configuration
- ✅ Command handlers (`/start`, `/help`, `/about`, `/status`)
- ✅ Audio processing workflow
- ✅ Utility functions
- ✅ Error handling scenarios

### Code Quality

```bash
# Format code
black src/ tests/

# Security check
bandit -r src/
```

## 🔧 Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `TELEGRAM_BOT_TOKEN` | Bot token from @BotFather | Required |
| `WHISPER_MODEL_PATH` | Path to Whisper model file | `models/ggml-base.en.bin` |
| `WHISPER_MODEL_NAME` | Model name for display | `base.en` |
| `BOT_USERNAME` | Bot username for branding | `TranscriberXBOT` |
| `MAX_AUDIO_SIZE_MB` | Maximum audio file size | `50` |
| `SUPPORTED_FORMATS` | Supported audio formats | `mp3,m4a,wav,ogg,flac` |
| `LOG_LEVEL` | Logging verbosity | `INFO` |
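
These are plain environment lookups, so the configuration layer stays very small. A sketch of how they might be read (names and defaults taken from the table above; the project's `config.py` may parse them differently):

```python
import os

# TELEGRAM_BOT_TOKEN is the only required value; everything else has a sane default.
TELEGRAM_BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
WHISPER_MODEL_PATH = os.getenv("WHISPER_MODEL_PATH", "models/ggml-base.en.bin")
WHISPER_MODEL_NAME = os.getenv("WHISPER_MODEL_NAME", "base.en")
BOT_USERNAME = os.getenv("BOT_USERNAME", "TranscriberXBOT")
MAX_AUDIO_SIZE_MB = int(os.getenv("MAX_AUDIO_SIZE_MB", "50"))
SUPPORTED_FORMATS = set(os.getenv("SUPPORTED_FORMATS", "mp3,m4a,wav,ogg,flac").split(","))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
```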

### Performance Tuning

```bash
# Adjust CPU threads for transcription
export WHISPER_THREADS=4

# Set memory limits
export WHISPER_MAX_MEMORY=512M

# Configure concurrent processing
export MAX_CONCURRENT_TRANSCRIPTIONS=5
```

## 📊 Performance Metrics

| Audio Length | Processing Time | Memory Usage |
|--------------|----------------|--------------|
| 30 seconds | ~1.2 seconds | ~180MB |
| 2 minutes | ~2.8 seconds | ~200MB |
| 5 minutes | ~6.1 seconds | ~220MB |

### Scaling Recommendations

- **Single Instance**: Handles 50+ concurrent users
- **Minimal Resources**: 1 CPU core, 512MB RAM minimum (no GPU required!)
- **Storage**: 1GB for model + temporary files
- **Cost-Effective**: Perfect for budget VPS hosting ($5-10/month)
- **No External APIs**: Zero ongoing API costs or dependencies
- **Load Balancing**: Deploy multiple instances for higher traffic

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Quick Contribution Steps

1. **Fork** the repository
2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. **Commit** your changes (`git commit -m 'Add amazing feature'`)
4. **Push** to the branch (`git push origin feature/amazing-feature`)
5. **Open** a Pull Request

### Development Guidelines

- Follow PEP 8 style guide
- Write tests for new features
- Update documentation
- Ensure Docker build succeeds
- Run quality checks before PR

## 📈 Technical Architecture

### Core Components

- **Framework**: [python-telegram-bot](https://github.com/python-telegram-bot/python-telegram-bot) v22.2
- **AI Model**: [OpenAI Whisper](https://github.com/openai/whisper) base.en (147MB, CPU-optimized)
- **Bindings**: [pywhispercpp](https://github.com/aarnphm/pywhispercpp) for C++ performance, no GPU needed (see the sketch after this list)
- **Runtime**: Python 3.11 with asyncio for concurrent processing
- **Container**: Multi-architecture Docker (AMD64/ARM64)
- **Resource**: CPU-only inference, minimal memory footprint
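
The transcription layer boils down to loading the ggml model once and feeding it audio file paths. A rough sketch of the pywhispercpp usage this implies (based on pywhispercpp's `Model` API as documented upstream; the project's `transcriber.py` may wrap it differently):

```python
from pywhispercpp.model import Model

# Load the bundled ggml model once at startup; n_threads keeps CPU usage bounded.
model = Model("models/ggml-base.en.bin", n_threads=4)


def transcribe_file(path: str) -> str:
    # transcribe() returns a list of segments; join their text into one transcript.
    segments = model.transcribe(path)
    return " ".join(segment.text.strip() for segment in segments)
```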

### Performance Features

- **Async Processing**: Non-blocking audio transcription
- **Concurrent Handling**: Multiple users supported simultaneously (see the concurrency sketch after this list)
- **Memory Management**: Efficient model loading and cleanup
- **Error Recovery**: Robust error handling and logging
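
Because Whisper inference is CPU-bound, concurrency is typically achieved by pushing each job onto a worker thread and capping how many run at once. A sketch of that pattern, reusing the `transcribe_file` helper from the previous sketch (the cap mirrors the `MAX_CONCURRENT_TRANSCRIPTIONS` variable; this is illustrative, not the bot's exact code):

```python
import asyncio

MAX_CONCURRENT_TRANSCRIPTIONS = 5
_semaphore = asyncio.Semaphore(MAX_CONCURRENT_TRANSCRIPTIONS)


async def transcribe_async(path: str) -> str:
    # Limit simultaneous jobs, then run the blocking Whisper call off the event loop.
    async with _semaphore:
        return await asyncio.to_thread(transcribe_file, path)
```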

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- **OpenAI** - For the incredible Whisper speech recognition model
- **pywhispercpp** - High-performance Python bindings for whisper.cpp
- **python-telegram-bot** - Excellent Telegram Bot API framework
- **whisper.cpp** - Optimized C++ implementation of Whisper

## 👨‍💻 Developer

**Malith Rukshan**
- 🌐 Website: [malith.dev](https://malith.dev)
- 📧 Email: hello@malith.dev
- 🐦 Telegram: [@MalithRukshan](https://t.me/MalithRukshan)

---

### ⭐ Star History

[![Star History Chart](https://api.star-history.com/svg?repos=Malith-Rukshan/whisper-transcriber-bot&type=Date)](https://star-history.com/#Malith-Rukshan/whisper-transcriber-bot&Date)

**If this project helped you, please consider giving it a ⭐!**

Made with ❤️ by [Malith Rukshan](https://malith.dev)

[🚀 Try the Bot](https://t.me/TranscriberXBOT) • [⭐ Star on GitHub](https://github.com/Malith-Rukshan/whisper-transcriber-bot) • [🐳 Docker Hub](https://hub.docker.com/r/malithrukshan/whisper-transcriber-bot)