https://github.com/itsmevictor/clean-transcribe
A simple CLI to transcribe YouTube videos or local audio/video files and produce LLM-cleaned transcripts for analysis, reading, or subtitles.
cli llm speech-to-text subtitles transcription whisper youtube
- Host: GitHub
- URL: https://github.com/itsmevictor/clean-transcribe
- Owner: itsmevictor
- License: mit
- Created: 2025-06-22T18:53:49.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-11-06T21:52:24.000Z (8 months ago)
- Last Synced: 2025-11-06T23:28:39.525Z (8 months ago)
- Topics: cli, llm, speech-to-text, subtitles, transcription, whisper, youtube
- Language: Python
- Homepage:
- Size: 80.1 KB
- Stars: 18
- Watchers: 1
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# Clean Transcriber
A command-line tool that turns any YouTube video or local audio/video file into a clean, readable text transcript. It transcribes with the model of your choice (local or API-based), then uses your preferred LLM to automatically clean and reformat the output.
## Features
1. **Multiple input formats**: Accepts a YouTube URL or a local audio/video file (e.g., `.mp3`, `.wav`, `.m4a`, `.opus`, `.mp4`, `.mkv`, `.mov`).
2. **Multiple output formats**: Generate clean transcripts in TXT, SRT, or VTT formats.
3. **Flexible transcription models**: Choose from various local (Whisper, Voxtral) and API-based (OpenAI, Gemini, Mistral) models for different use cases.
4. **LLM-powered cleaning**: Removes filler words, fixes grammar, and organizes content into readable paragraphs.
5. **Wide LLM support**: Use Gemini, ChatGPT, Claude, or any other LLM (including local ones) for cleaning.
## Quick Start
```bash
# Transcribe a YouTube video
clean-transcribe "https://www.youtube.com/watch?v=VIDEO_ID"
# Transcribe a local video file
clean-transcribe "/path/to/your/video.mp4"
# Transcribe a specific segment of a video
clean-transcribe "https://www.youtube.com/watch?v=VIDEO_ID" --start "1:30" --end "2:30"
# Create clean subtitles from a video
clean-transcribe "https://www.youtube.com/watch?v=VIDEO_ID" -f srt -o subtitles.srt
```
## Installation
**Option 1: Install with pip**
```bash
pip install clean-transcribe
```
**Option 2: Install from source (editable install)**
```bash
# Clone the repository and install it in editable mode
git clone https://github.com/itsmevictor/clean-transcribe && cd clean-transcribe
pip install -e .
# Run it on a YouTube video
clean-transcribe "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
```
## Configuration
### Key Options
- `--format, -f`: Output format (txt, srt, vtt)
- `--model, -m`: Transcription model (whisper-tiny, whisper-base, whisper-small, whisper-medium, whisper-large, whisper-turbo, whisper-1-api, gpt-4o-transcribe-api, gpt-4o-mini-transcribe-api, gemini-2.5-pro-api, gemini-2.5-flash-api, gemini-2.5-flash-lite-api, gemini-2.0-flash-api, voxtral-mini-api, voxtral-small-api, voxtral-mini-local, voxtral-small-local)
- `--start`: Start time for transcription (e.g., "1:30")
- `--end`: End time for transcription (e.g., "2:30")
- `--transcription-prompt`: Custom prompt for Whisper to guide transcription
- `--llm-model`: LLM for cleaning (gemini-2.0-flash-exp, gpt-4o-mini, etc.)
- `--cleaning-style`: Cleaning style (presentation, conversation, or lecture)
- `--save-raw`: Keep both raw and cleaned versions
- `--no-clean`: Skip AI cleaning
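If you want to drive the CLI from a script, the options above can be assembled programmatically. The following is a minimal sketch, not part of the tool: `parse_timestamp` and `build_command` are hypothetical helpers, and the accepted timestamp shapes (`SS`, `MM:SS`, `HH:MM:SS`) are an assumption, since this README only shows values like `"1:30"`. It uses only the flags listed above plus the `-o` flag from the Quick Start.

```python
import shlex


def parse_timestamp(value: str) -> int:
    """Convert "SS", "MM:SS", or "HH:MM:SS" into whole seconds.

    Assumption: these are the shapes worth supporting; the README only
    shows "M:SS" examples such as "1:30".
    """
    parts = value.strip().split(":")
    if not 1 <= len(parts) <= 3 or not all(p.isdecimal() for p in parts):
        raise ValueError(f"unrecognized timestamp: {value!r}")
    seconds = 0
    for part in parts:
        seconds = seconds * 60 + int(part)
    return seconds


def build_command(source, *, fmt="txt", output=None, start=None, end=None,
                  cleaning_style=None, save_raw=False, no_clean=False):
    """Assemble an argv list for clean-transcribe (hypothetical helper)."""
    # Local sanity check; whether the tool itself validates this is unknown.
    if start is not None and end is not None:
        if parse_timestamp(start) >= parse_timestamp(end):
            raise ValueError("--start must come before --end")

    argv = ["clean-transcribe", source, "--format", fmt]
    if output:
        argv += ["-o", output]
    if start is not None:
        argv += ["--start", start]
    if end is not None:
        argv += ["--end", end]
    if cleaning_style:
        argv += ["--cleaning-style", cleaning_style]
    if save_raw:
        argv.append("--save-raw")
    if no_clean:
        argv.append("--no-clean")
    return argv


if __name__ == "__main__":
    argv = build_command("/path/to/your/video.mp4", fmt="srt",
                         output="subtitles.srt", start="1:30", end="2:30")
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(argv))
```

To execute the command rather than print it, pass the list to `subprocess.run(argv, check=True)`. Building a list instead of a single string avoids shell-quoting problems with URLs and paths that contain spaces.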
## LLM-Powered Cleaning Setup
### Quick Setup (Recommended)
```bash
# Install and configure Gemini (fast + cost-effective)
llm install llm-gemini
llm keys set gemini
# Enter your Gemini API key when prompted
# Or use any other LLM provider
# OpenAI
llm keys set openai
# Anthropic Claude
llm install llm-claude-3
llm keys set claude
```
*Uses Simon Willison's excellent [llm package](https://github.com/simonw/llm) for provider flexibility.*
### Cleaning Process
**What it does:**
- Removes filler words (um, uh, so, like, you know, etc.)
- Fixes grammar and punctuation errors
- Organizes content into logical paragraphs
- Maintains original meaning and context
**Cleaning styles:**
- **presentation**: Professional tone, organized paragraphs
- **conversation**: Natural flow, minimal cleanup
- **lecture**: Educational format, clear sections for notes
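To give a feel for how a cleaning style could translate into an LLM request, here is a rough sketch built on the `llm` package's Python API. The prompt wording, the `STYLE_INSTRUCTIONS` mapping, and both function names are illustrative assumptions; the prompts that clean-transcribe actually sends are not shown in this README.

```python
# Hypothetical instruction text per style; NOT the tool's real prompts.
STYLE_INSTRUCTIONS = {
    "presentation": "Use a professional tone and organize the text into clear paragraphs.",
    "conversation": "Keep the natural flow of speech and make only minimal edits.",
    "lecture": "Use an educational format with clear sections suitable for notes.",
}


def build_system_prompt(style: str) -> str:
    """Combine the general cleaning goals with a style-specific instruction."""
    if style not in STYLE_INSTRUCTIONS:
        raise ValueError(f"unknown cleaning style: {style!r}")
    return (
        "Clean the following transcript: remove filler words, fix grammar and "
        "punctuation, organize the content into logical paragraphs, and keep "
        "the original meaning. " + STYLE_INSTRUCTIONS[style]
    )


def clean_transcript(raw_text: str, style: str = "presentation",
                     model_id: str = "gpt-4o-mini") -> str:
    """Send the raw transcript to an LLM through the `llm` package.

    Requires `pip install llm` and an API key configured with `llm keys set`.
    """
    import llm  # imported lazily so the prompt helper works without it

    model = llm.get_model(model_id)
    response = model.prompt(raw_text, system=build_system_prompt(style))
    return response.text()
```

Keeping the style-specific text in a small mapping makes it easy to compare how the three styles differ and to experiment with the wording before spending API calls.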
*Note: For SRT and VTT output, cleaning changes only the text content; the original timings are preserved.*
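The idea of cleaning subtitle text while leaving timings alone can be illustrated with a small, self-contained sketch. This is not the tool's implementation: `clean_srt` is a hypothetical helper, it assumes `\n` line endings, and `strip_fillers` is a toy regex stand-in for the LLM cleaning step.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING_RE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def clean_srt(srt_text, clean_line):
    """Apply clean_line to cue text only; keep indices and timings untouched."""
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    cleaned_blocks = []
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            if TIMING_RE.match(line):
                # Everything up to and including the timing line is the header.
                header, text = lines[: i + 1], lines[i + 1:]
                cleaned = header + [clean_line(t) for t in text]
                cleaned_blocks.append("\n".join(cleaned))
                break
        else:
            # No timing line found: not a recognizable cue, keep it as-is.
            cleaned_blocks.append(block)
    return "\n\n".join(cleaned_blocks) + "\n"


def strip_fillers(line):
    """Toy stand-in for the LLM step: drop "um"/"uh" and tidy spacing."""
    line = re.sub(r"\b(?:um|uh)\b,?\s*", "", line, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", line).strip()


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:04,000\nUm, welcome to the uh talk.\n\n"
        "2\n00:00:04,500 --> 00:00:06,000\nLet's begin.\n"
    )
    print(clean_srt(sample, strip_fillers))
```

Because the cleaning function only ever sees the cue text, the cue numbers and timing lines pass through unchanged, which is the property the note above describes.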
## Feedback
I'd love to hear your feedback! If you encounter any issues, have suggestions for new features, or just want to share your experience, please don't hesitate to [open an issue](https://github.com/itsmevictor/clean-transcribe/issues).