https://github.com/itsmevictor/youtube-to-text
A simple CLI to transcribe YouTube videos, clean the extracted text, and produce easy-to-use transcripts for analysis, reading, or subtitles.
- Host: GitHub
- URL: https://github.com/itsmevictor/youtube-to-text
- Owner: itsmevictor
- License: mit
- Created: 2025-06-22T18:53:49.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-06-24T14:22:55.000Z (3 months ago)
- Last Synced: 2025-06-24T15:36:46.889Z (3 months ago)
- Topics: cli, llm, openai-whisper, speech-to-text, transcription, whisper, youtube
- Language: Python
- Homepage:
- Size: 40 KB
- Stars: 7
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# YouTube Transcriber
A command-line tool to turn any YouTube video into a clean, readable text transcript. It uses OpenAI's Whisper for transcription and the LLM of your choice to automatically clean and reformat the output.
## Features
1. **Automatic download** from YouTube
2. **Fast and accurate transcription** using OpenAI's Whisper models
3. **LLM-powered cleaning** that removes filler words, fixes grammar, and organizes content into readable paragraphs
4. **Multiple output formats** (TXT, SRT, VTT) for any use case
5. **Flexible LLM support** - use Gemini, ChatGPT, Claude, or any other (including local) LLM for cleaning
## Quick Start
```bash
# Basic usage - transcribe and clean
python main.py "https://www.youtube.com/watch?v=VIDEO_ID"

# Create clean subtitles
python main.py "https://www.youtube.com/watch?v=VIDEO_ID" -f srt -o subtitles.srt
```
## Installation
**Option 1: Clone and run**
```bash
git clone https://github.com/itsmevictor/youtube-to-text
cd youtube-to-text
pip install -r requirements.txt
python main.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
```
**Option 2: Install as package**
```bash
git clone https://github.com/itsmevictor/youtube-to-text
cd youtube-to-text
pip install -e .
youtube-transcribe "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
```
**Requirements:**
- Python 3.7+
- FFmpeg (for audio processing)
- LLM API key (for cleaning, optional but recommended)
## Usage Examples
**Basic transcription with cleaning:**
```bash
python main.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
```
**Create clean subtitles:**
```bash
python main.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -f srt -o subtitles.srt
```
**High-quality lecture transcription:**
```bash
python main.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" \
-m large \
--llm-model gemini-2.0-flash-exp \
--cleaning-style lecture \
--save-raw
```
**Raw transcript (no cleaning):**
```bash
python main.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --no-clean
```
## Configuration
### Key Options
- `--format, -f`: Output format (txt, srt, vtt)
- `--model, -m`: Whisper model (tiny, base, small, medium, large, turbo)
- `--llm-model`: LLM for cleaning (gemini-2.0-flash-exp, gpt-4o-mini, etc.)
- `--cleaning-style`: presentation, conversation, or lecture
- `--save-raw`: Keep both raw and cleaned versions
- `--no-clean`: Skip AI cleaning
### Whisper Models
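Each `--model` value corresponds to a Whisper checkpoint name. As a rough sketch, assuming the `openai-whisper` Python package and FFmpeg are installed, loading a model and transcribing a local audio file could look like the following. The helper names `validate_model` and `transcribe_audio` and the `base` default are illustrative assumptions, not taken from this project's code.

```python
# Model names as listed for the --model option in this README.
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large", "turbo")


def validate_model(name: str) -> str:
    """Return the name unchanged, or raise if it is not a documented choice."""
    if name not in WHISPER_MODELS:
        raise ValueError(
            f"unknown Whisper model {name!r}; expected one of {WHISPER_MODELS}"
        )
    return name


def transcribe_audio(audio_path: str, model_name: str = "base") -> dict:
    """Transcribe a local audio file with openai-whisper.

    Hypothetical helper: the "base" default is an assumption.
    """
    validate_model(model_name)
    # Imported lazily so that validation works without the package installed.
    import whisper

    model = whisper.load_model(model_name)
    # The result dict contains the full "text" and a list of timed "segments".
    return model.transcribe(audio_path)
```

Larger models download more data on first use and run more slowly, so validating the name before loading avoids a wasted download on a typo.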
| Model | Speed | Accuracy | Parameters | Notes |
|-------|-------|----------|------------|-------|
| tiny | Fastest | Basic | ~39 M | Quick transcripts |
| base | Fast | Good | ~74 M | Balanced option |
| small | Moderate | Good | ~244 M | Good for most use cases |
| large | Slow | Best | ~1550 M | Highest quality |
## LLM-Powered Cleaning Setup
### Quick Setup (Recommended)
```bash
# Install and configure Gemini (fast + cost-effective)
llm install llm-gemini
llm keys set gemini
# Enter your Gemini API key when prompted
```
### Alternative Providers
```bash
# OpenAI
llm keys set openai

# Anthropic Claude
llm install llm-claude-3
llm keys set claude
```
**Popular models:**
- `gemini-2.0-flash-exp` (recommended - fast, cheap)
- `gpt-4o-mini` (OpenAI, fast)
- `claude-3-5-sonnet-20241022` (Anthropic, high quality)

*Uses Simon Willison's excellent [llm package](https://github.com/simonw/llm) for provider flexibility.*
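Once a key is stored with `llm keys set`, the same models can also be called from Python through the `llm` package. The sketch below is only an illustration of that API: `STYLE_HINTS`, `build_system_prompt`, and `clean_transcript` are hypothetical names, and the prompt wording is an assumption based on the feature list and the `--cleaning-style` values, not the prompts this project actually sends.

```python
# Hypothetical per-style instructions; the project's real prompts are not
# shown in this README.
STYLE_HINTS = {
    "presentation": "Use a professional tone and organized paragraphs.",
    "conversation": "Keep the natural flow and apply only minimal cleanup.",
    "lecture": "Use an educational format with clear sections for notes.",
}


def build_system_prompt(style: str = "presentation") -> str:
    """Combine the general cleaning goals with a style-specific hint."""
    if style not in STYLE_HINTS:
        raise ValueError(f"unknown cleaning style {style!r}")
    return (
        "Clean this transcript: remove filler words, fix grammar and "
        "punctuation, and organize it into readable paragraphs without "
        "changing the meaning. " + STYLE_HINTS[style]
    )


def clean_transcript(
    raw_text: str,
    model_name: str = "gemini-2.0-flash-exp",
    style: str = "presentation",
) -> str:
    """Send the raw transcript to the chosen model and return the cleaned text."""
    # Imported lazily so the prompt helpers work without the package installed.
    import llm

    model = llm.get_model(model_name)  # uses the key stored by `llm keys set`
    response = model.prompt(raw_text, system=build_system_prompt(style))
    return response.text()
```

Keeping the style hint in the system prompt means the transcript itself is passed through unchanged as the user prompt, which makes it easy to swap models or styles without touching the text.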
## How LLM Cleaning Works
**What it does:**
- Removes filler words (um, uh, so, like, you know, etc.)
- Fixes grammar and punctuation errors
- Organizes content into logical paragraphs
- Maintains original meaning and context

**Cleaning styles:**
- **presentation**: Professional tone, organized paragraphs
- **conversation**: Natural flow, minimal cleanup
- **lecture**: Educational format, clear sections for notes
## Output Formats
| Format | Description | Use Case |
|--------|-------------|----------|
| **TXT** | Plain text | Articles, notes, analysis |
| **SRT** | SubRip subtitles | Video editing, accessibility |
| **VTT** | WebVTT subtitles | Web players, streaming |

*Note: SRT/VTT preserve timing while cleaning text content.*
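
To illustrate how timing can be preserved in the subtitle formats, here is a minimal sketch that renders timed segments as SRT or VTT. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field, which is the shape Whisper returns. The function names are hypothetical, and this is not necessarily how the project writes its files.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments) -> str:
    """Render segments as numbered SubRip cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments) -> str:
    """Render segments as WebVTT cues (header line, dot as millisecond separator)."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        lines += [f"{start} --> {end}", seg["text"].strip(), ""]
    return "\n".join(lines)
```

Because only the `text` field of each segment would change during cleaning, the `start` and `end` values can be carried through untouched.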