https://github.com/derogab/yt-transcript
A Dockerized Telegram bot that downloads YouTube videos as MP3 audio and transcribes them using OpenAI Whisper
openai-whisper telegram telegram-bot whisper youtube youtube-transcript
- Host: GitHub
- URL: https://github.com/derogab/yt-transcript
- Owner: derogab
- License: mit
- Created: 2025-07-12T20:17:04.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-07-13T00:31:00.000Z (11 months ago)
- Last Synced: 2025-07-13T02:37:53.175Z (11 months ago)
- Topics: openai-whisper, telegram, telegram-bot, whisper, youtube, youtube-transcript
- Language: Python
- Homepage:
- Size: 1.1 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# YT-Transcript
A Dockerized Telegram bot that downloads YouTube videos as MP3 audio and transcribes them using OpenAI Whisper
### Features
- 🤖 Telegram bot interface
- 🎵 Download YouTube videos as MP3 audio
- 📝 Transcribe audio using OpenAI Whisper
- 🔄 Automatic cleanup of temporary files
- 📱 Support for long transcriptions (split into multiple messages)
- 🐳 Docker containerized for easy deployment
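Splitting long transcriptions matters because Telegram limits a single text message to 4096 characters. The following is a minimal sketch of one way to do that split, not necessarily how the bot does it; the names `split_message` and `TELEGRAM_MAX_CHARS` are hypothetical and do not come from the repository.

```python
TELEGRAM_MAX_CHARS = 4096  # Telegram's limit for one text message


def split_message(text: str, limit: int = TELEGRAM_MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Breaks at the last whitespace inside the window so words stay intact,
    and falls back to a hard cut when a run of text has no whitespace.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Look one character past the limit so a space exactly at the
        # boundary still counts as a valid break point.
        window = remaining[: limit + 1]
        cut = max(window.rfind(" "), window.rfind("\n"))
        if cut <= 0:
            cut = limit  # no whitespace available: hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as its own message, in order.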
### Quick Start
1. **Clone the repository:**
```bash
git clone https://github.com/derogab/yt-transcript
cd yt-transcript
```
2. **Create environment file:**
```bash
# Create .env file
echo "TELEGRAM_TOKEN=your_telegram_bot_token_here" > .env
echo "WHISPER_MODEL=base" >> .env
```
3. **Run with Docker Compose:**
```bash
docker-compose up -d
```
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `TELEGRAM_TOKEN` | Your Telegram bot token (required) | - |
| `WHISPER_MODEL` | Whisper model size (tiny, base, small, medium, large) | base |
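As an illustration of how these two variables might be read and validated at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, and restricting `WHISPER_MODEL` to the five sizes in the table is an assumption based on this README, not on the bot's source.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Model sizes listed in this README (assumed to be the accepted set).
VALID_MODELS = ("tiny", "base", "small", "medium", "large")


@dataclass(frozen=True)
class Settings:
    telegram_token: str
    whisper_model: str = "base"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing fast on bad input."""
    env = os.environ if env is None else env
    token = env.get("TELEGRAM_TOKEN", "").strip()
    if not token:
        raise RuntimeError("TELEGRAM_TOKEN is required")
    # An unset or empty WHISPER_MODEL falls back to the documented default.
    model = env.get("WHISPER_MODEL", "").strip().lower() or "base"
    if model not in VALID_MODELS:
        raise ValueError(
            f"WHISPER_MODEL must be one of {', '.join(VALID_MODELS)}; got {model!r}"
        )
    return Settings(telegram_token=token, whisper_model=model)
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to test.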
### Manual Docker Build
```bash
# Build the image
docker build -t yt-transcript-bot .
# Run the container
docker run --env-file .env yt-transcript-bot
```
### Usage
1. Start the bot using Docker
2. Open your Telegram bot and send a YouTube URL
3. The bot will:
   - Download the video as MP3
   - Transcribe the audio using Whisper
   - Send you the transcription
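The download, transcribe, and reply steps, together with the automatic cleanup of temporary files, can be sketched as a small pipeline. This is an illustration under stated assumptions, not the bot's actual code: `process_video` is a hypothetical name, and the downloader and transcriber are passed in as callables because this README does not say how the bot invokes them.

```python
import tempfile
from pathlib import Path
from typing import Callable


def process_video(
    url: str,
    download_mp3: Callable[[str, Path], Path],
    transcribe: Callable[[Path], str],
) -> str:
    """Download audio into a temporary directory, transcribe it, clean up.

    `download_mp3(url, workdir)` must return the path of the MP3 it wrote;
    `transcribe(path)` must return the transcription text.
    """
    # The directory and everything in it is deleted when the block exits,
    # even if the download or the transcription raises.
    with tempfile.TemporaryDirectory(prefix="yt-transcript-") as tmp:
        audio_path = download_mp3(url, Path(tmp))
        return transcribe(audio_path).strip()
```

Injecting the two callables keeps the cleanup logic independent of whichever downloader and Whisper call are used, and lets the flow be exercised with stand-ins.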
### Supported YouTube URL Formats
- `https://www.youtube.com/watch?v=VIDEO_ID`
- `https://youtu.be/VIDEO_ID`
- `https://youtube.com/embed/VIDEO_ID`
- `https://youtube.com/v/VIDEO_ID`
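One way to recognize these four URL shapes and pull out the video ID is sketched below with the standard library. The function name `extract_video_id` is hypothetical, and the check that an ID is 11 characters drawn from letters, digits, `_`, and `-` reflects the commonly observed format rather than anything stated in this README.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 URL-safe characters.
_VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for the URL formats listed above, else None."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]

    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]                      # youtu.be/VIDEO_ID
    elif host == "youtube.com":
        if parts[:1] == ["watch"]:                # watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in ("embed", "v"):
            candidate = parts[1]                  # embed/VIDEO_ID, v/VIDEO_ID

    if candidate and _VIDEO_ID.match(candidate):
        return candidate
    return None
```

Extra query parameters such as a timestamp are ignored, and any other host or path returns `None`.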
### Bot Commands
- `/start` - Show welcome message and instructions
- `/help` - Show detailed help information
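To show how incoming messages might be sorted into the two commands and plain YouTube links, here is a library-independent sketch. `classify_message` is a hypothetical helper; the bot itself presumably relies on its Telegram library's handlers, which this README does not describe.

```python
KNOWN_COMMANDS = ("start", "help")


def classify_message(text: str) -> str:
    """Return 'start', 'help', 'url', or 'unknown' for an incoming message."""
    stripped = text.strip()
    if stripped.startswith("/"):
        # In group chats Telegram may append the bot name: /help@my_bot
        command = stripped.split()[0][1:].split("@", 1)[0].lower()
        return command if command in KNOWN_COMMANDS else "unknown"
    if stripped.startswith(("http://", "https://")):
        return "url"
    return "unknown"
```

A message classified as `url` would still need to be checked against the supported YouTube URL formats before any download starts.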
### Configuration
##### Whisper Models
You can change the Whisper model size by setting the `WHISPER_MODEL` environment variable:
- `tiny` - Fastest, least accurate
- `base` - Good balance (default)
- `small` - Better accuracy
- `medium` - High accuracy
- `large` - Best accuracy, slowest
##### Example .env file:
```env
TELEGRAM_TOKEN=1234567890:ABCdefGHIjklMNOpqrsTUVwxyz
WHISPER_MODEL=base
```
##### Docker Commands
```bash
# Start the bot
docker-compose up -d
# View logs
docker-compose logs -f
# Stop the bot
docker-compose down
# Rebuild and restart
docker-compose up -d --build
```
### Tip
If you like this project or directly benefit from it, please consider buying me a coffee:
🔗 `bc1qd0qatgz8h62uvnr74utwncc6j5ckfz2v2g4lef`
⚡️ `derogab@sats.mobi`
💶 [Sponsor on GitHub](https://github.com/sponsors/derogab)
### Credits
_YT-Transcript_ is made with ♥ by [derogab](https://github.com/derogab) and it's released under the [MIT license](./LICENSE).