https://github.com/tameronline/tol_vido

Last synced: 9 months ago
JSON representation

Host: GitHub
URL: https://github.com/tameronline/tol_vido
Owner: TamerOnLine
License: mit
Created: 2025-07-15T23:13:42.000Z (12 months ago)
Default Branch: main
Last Pushed: 2025-07-24T13:49:48.000Z (11 months ago)
Last Synced: 2025-07-24T18:05:11.160Z (11 months ago)
Language: Python
Size: 66.3 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# 🎬 tol_vido – YouTube to Clean Text (LLM Powered)

[![Build](https://github.com/TamerOnLine/pro_venv/actions/workflows/test-pro_venv.yml/badge.svg)](https://github.com/TamerOnLine/pro_venv/actions)
[![License](https://img.shields.io/github/license/TamerOnLine/pro_venv?style=flat-square)](LICENSE)

`tol_vido` is a powerful, local-first Python tool that transcribes YouTube videos into well-formatted German paragraphs using Large Language Models (LLMs) like **Mistral** via `.gguf`. It's fast, customizable, and runs entirely on your machine.

---

## 🚀 Features

- ✅ Download and convert YouTube audio with `yt-dlp`
- ✅ Segment and transcribe speech with `SpeechRecognition` (Google)
- ✅ Format the raw text using a local LLM (e.g. Mistral via `llama-cpp-python`)
- ✅ Full local processing — **no cloud required**
- ✅ Modular and extensible codebase
- ✅ Auto-setup: project config, `venv`, VS Code files, and more

---

## 🧱 Project Structure

```bash
tol_vido/
├── 01_pro_venv.py # Setup: venv, config, VS Code
├── 02_start_llm_server.py # Run local LLM server
├── 03_main.py # Entry point runner
├── app.py # Main processing logic
├── extract_text_from_video.py # Transcribe audio from video
├── format_text.py # Format using local LLM
├── output.txt # Raw transcription result
├── formatted_output.txt # Final cleaned-up version
├── setup-config.json # Project metadata
├── requirements.txt # Dependencies
├── env-info.txt # Python & installed packages
├── tools/
│ └── download_model.py # GGUF model downloader
└── .github/workflows/ # GitHub Actions
```

---

## ▶️ Getting Started

### 1. Clone the repository

```bash
git clone https://github.com/YourUsername/tol_vido.git
cd tol_vido
```

### 2. Run the setup script

```bash
python 01_pro_venv.py
```

This will:
- Create a virtual environment
- Install dependencies
- Generate `main.py`, `app.py`, `.vscode/` settings

### 3. Start the LLM server
### ⚠️ Important: Pre-Run Setup

Before running the main script, make sure the following steps are completed:

#### 📁 1. Download and place FFmpeg binaries
This project requires FFmpeg tools for audio conversion.

- Download from: [https://www.gyan.dev/ffmpeg/builds/](https://www.gyan.dev/ffmpeg/builds/)
- Extract and copy the following files into a folder named `bin/` in the project root:

```bash
bin/
├── ffmpeg.exe
├── ffprobe.exe
├── ffplay.exe # (optional)
```

> You **do not need** to add FFmpeg to your system PATH. The project uses them directly from `./bin/`.

---

#### 📄 2. Create `.env` file

In the project root, create a file named `.env` with the following content:

```env
# Required for downloading and running the model
MODEL_NAME=mistral-7b-instruct-v0.1.Q4_K_M.gguf

# Local path to the GGUF model file (adjust this path to match your setup)
MODEL=r:/tol_vido/models/mistral-7b-instruct-v0.1.Q4_K_M.gguf

# LLM Server runtime parameters
N_CTX=4096
N_THREADS=8
N_GPU_LAYERS=35
VERBOSE=true
```

These values are used by:
- `tools/download_model.py` – to fetch the model from Hugging Face
- `02_start_llm_server.py` – to launch the local LLM inference server

```bash
python 02_start_llm_server.py
```

> Make sure the `.gguf` model exists in the `models/` folder. You can use `tools/download_model.py` to fetch it from Hugging Face.

### 4. Run the full pipeline

```bash
python 03_main.py
```

You’ll be prompted for a YouTube URL. After processing, the result will be saved in:

- `output.txt`: raw transcription
- `formatted_output.txt`: structured, punctuated paragraphs

---

## 📦 Requirements
---

## 🌐 Model Setup (GGUF)
### 2. Download the model

```bash
python tools/download_model.py
```

Model will be downloaded from:
> [TheBloke/Mistral-7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF)

---

## 💡 Notes

- All audio processing and formatting are local.
- Transcription uses `Google Speech Recognition` — no API key required.
- Formatting uses a **local** LLM server (default: `localhost:11434`).
- Works offline once the model is downloaded.

---

## 🛠 VS Code Support

The setup script automatically generates:

- `.vscode/settings.json` → sets interpreter path
- `.vscode/launch.json` → debugger config
- `project.code-workspace` → pre-configured workspace

---

## 📜 License

This project is licensed under the [MIT License](LICENSE).
You are free to use, modify, and distribute it with attribution.
Feel free to explore and build upon it!

---

## 🙌 Acknowledgements

This project would not have been possible without the following open-source tools and libraries. Sincere thanks to their developers and contributors:

- [Mistral 7B](https://mistral.ai) – for providing the foundational language model.
- [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) – for efficient local inference of LLMs.
- [yt-dlp](https://github.com/yt-dlp/yt-dlp) – for reliable YouTube media extraction.
- [tiktoken](https://github.com/openai/tiktoken) – for fast and accurate tokenization.

---

## 👨‍💻 About the Author

🎯 **Tamer OnLine – Developer & Architect**
A dedicated software engineer and educator with a focus on building multilingual, modular, and open-source applications using Python, Flask, and PostgreSQL.

🔹 Founder of **Flask University** – an initiative to create real-world, open-source Flask projects
🔹 Creator of [@TamerOnPi](https://www.youtube.com/@mystrotamer) – a YouTube channel sharing tech, tutorials, and Pi Network insights
🔹 Passionate about helping developers learn by building, one milestone at a time

Connect or contribute:

[![GitHub](https://img.shields.io/badge/GitHub-TamerOnLine-181717?style=flat&logo=github)](https://github.com/TamerOnLine)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-blue?style=flat&logo=linkedin)](https://www.linkedin.com/in/tameronline/)
[![YouTube](https://img.shields.io/badge/YouTube-TamerOnPi-red?style=flat&logo=youtube)](https://www.youtube.com/@mystrotamer)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/tameronline/tol_vido

Awesome Lists containing this project

README