{"id":31642460,"url":"https://github.com/tameronline/tol_vido","last_synced_at":"2025-10-07T03:59:37.685Z","repository":{"id":304891347,"uuid":"1020464082","full_name":"TamerOnLine/tol_vido","owner":"TamerOnLine","description":null,"archived":false,"fork":false,"pushed_at":"2025-07-24T13:49:48.000Z","size":69568,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-24T18:05:11.160Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TamerOnLine.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-15T23:13:42.000Z","updated_at":"2025-07-24T13:49:52.000Z","dependencies_parsed_at":"2025-07-17T03:52:36.752Z","dependency_job_id":"6dc06695-83fb-4226-8577-3f4dd07ded52","html_url":"https://github.com/TamerOnLine/tol_vido","commit_stats":null,"previous_names":["tameronline/tol_vido"],"tags_count":0,"template":false,"template_full_name":"TamerOnLine/pro_venv","purl":"pkg:github/TamerOnLine/tol_vido","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Ftol_vido","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Ftol_vido/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Ftol_vido/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Ftol_vido/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/TamerOnLine","download_url":"https://codeload.github.com/TamerOnLine/tol_vido/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Ftol_vido/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278717434,"owners_count":26033542,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-07T03:59:32.477Z","updated_at":"2025-10-07T03:59:37.680Z","avatar_url":"https://github.com/TamerOnLine.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🎬 tol_vido – YouTube to Clean Text (LLM Powered)\n\n[![Build](https://github.com/TamerOnLine/pro_venv/actions/workflows/test-pro_venv.yml/badge.svg)](https://github.com/TamerOnLine/pro_venv/actions)\n[![License](https://img.shields.io/github/license/TamerOnLine/pro_venv?style=flat-square)](LICENSE)\n\n`tol_vido` is a powerful, local-first Python tool that transcribes YouTube videos into well-formatted German paragraphs using Large Language Models (LLMs) like **Mistral** via `.gguf`. 
It's fast, customizable, and runs the LLM on your own machine.\n\n---\n\n## 🚀 Features\n\n- ✅ Download and convert YouTube audio with `yt-dlp`\n- ✅ Segment and transcribe speech with `SpeechRecognition` (Google)\n- ✅ Format the raw text using a local LLM (e.g. Mistral via `llama-cpp-python`)\n- ✅ Local LLM formatting: **no cloud LLM required**\n- ✅ Modular and extensible codebase\n- ✅ Auto-setup: project config, `venv`, VS Code files, and more\n\n---\n\n## 🧱 Project Structure\n\n```bash\ntol_vido/\n├── 01_pro_venv.py          # Setup: venv, config, VS Code\n├── 02_start_llm_server.py  # Run local LLM server\n├── 03_main.py              # Entry point runner\n├── app.py                  # Main processing logic\n├── extract_text_from_video.py  # Transcribe audio from video\n├── format_text.py          # Format using local LLM\n├── output.txt              # Raw transcription result\n├── formatted_output.txt    # Final cleaned-up version\n├── setup-config.json       # Project metadata\n├── requirements.txt        # Dependencies\n├── env-info.txt            # Python \u0026 installed packages\n├── tools/\n│   └── download_model.py   # GGUF model downloader\n└── .github/workflows/      # GitHub Actions\n```\n\n---\n\n## ▶️ Getting Started\n\n### 1. Clone the repository\n\n```bash\ngit clone https://github.com/TamerOnLine/tol_vido.git\ncd tol_vido\n```\n\n### 2. Run the setup script\n\n```bash\npython 01_pro_venv.py\n```\n\nThis will:\n- Create a virtual environment\n- Install dependencies\n- Generate `main.py`, `app.py`, `.vscode/` settings\n\n### ⚠️ Important: Pre-Run Setup\n\nBefore starting the LLM server or running the main script, make sure the following steps are completed:\n\n#### 📁 1. 
Download and place FFmpeg binaries\n\nThis project requires FFmpeg tools for audio conversion.\n\n- Download from: [https://www.gyan.dev/ffmpeg/builds/](https://www.gyan.dev/ffmpeg/builds/)\n- Extract and copy the following files into a folder named `bin/` in the project root:\n\n```bash\nbin/\n├── ffmpeg.exe\n├── ffprobe.exe\n└── ffplay.exe      # (optional)\n```\n\n\u003e You **do not need** to add FFmpeg to your system PATH. The project uses the binaries directly from `./bin/`.\n\n---\n\n#### 📄 2. Create `.env` file\n\nIn the project root, create a file named `.env` with the following content:\n\n```env\n# Required for downloading and running the model\nMODEL_NAME=mistral-7b-instruct-v0.1.Q4_K_M.gguf\n\n# Local path to the GGUF model file (adjust this path to match your setup)\nMODEL=r:/tol_vido/models/mistral-7b-instruct-v0.1.Q4_K_M.gguf\n\n# LLM Server runtime parameters\nN_CTX=4096\nN_THREADS=8\nN_GPU_LAYERS=35\nVERBOSE=true\n```\n\nThese values are used by:\n- `tools/download_model.py` – to fetch the model from Hugging Face\n- `02_start_llm_server.py` – to launch the local LLM inference server\n\n### 3. Start the LLM server\n\n```bash\npython 02_start_llm_server.py\n```\n\n\u003e Make sure the `.gguf` model exists in the `models/` folder. You can use `tools/download_model.py` to fetch it from Hugging Face.\n\n### 4. Run the full pipeline\n\n```bash\npython 03_main.py\n```\n\nYou’ll be prompted for a YouTube URL. After processing, the results are saved in:\n\n- `output.txt`: raw transcription  \n- `formatted_output.txt`: structured, punctuated paragraphs\n\n---\n\n## 📦 Requirements\n\nPython dependencies are listed in `requirements.txt` and installed by the setup script. FFmpeg binaries and a `.gguf` model are also required, as described in the setup steps.\n\n---\n\n## 🌐 Model Setup (GGUF)\n### 2. 
Download the model\n\n```bash\npython tools/download_model.py\n```\n\nThe model is downloaded from:\n\u003e [TheBloke/Mistral-7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF)\n\n---\n\n## 💡 Notes\n\n- Audio conversion and LLM formatting run locally.\n- Transcription uses `Google Speech Recognition`; no API key is required, but it needs an internet connection.\n- Formatting uses a **local** LLM server (default: `localhost:11434`).\n- The formatting step works offline once the model is downloaded.\n\n---\n\n## 🛠 VS Code Support\n\nThe setup script automatically generates:\n\n- `.vscode/settings.json` → sets interpreter path\n- `.vscode/launch.json` → debugger config\n- `project.code-workspace` → pre-configured workspace\n\n---\n\n## 📜 License\n\nThis project is licensed under the [MIT License](LICENSE).  \nYou are free to use, modify, and distribute it with attribution.  \nFeel free to explore and build upon it!\n\n---\n\n## 🙌 Acknowledgements\n\nThis project would not have been possible without the following open-source tools and libraries. 
Sincere thanks to their developers and contributors:\n\n- [Mistral 7B](https://mistral.ai) – for providing the foundational language model.\n- [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) – for efficient local inference of LLMs.\n- [yt-dlp](https://github.com/yt-dlp/yt-dlp) – for reliable YouTube media extraction.\n- [tiktoken](https://github.com/openai/tiktoken) – for fast and accurate tokenization.\n\n---\n\n## 👨‍💻 About the Author\n\n🎯 **Tamer OnLine – Developer \u0026 Architect**  \nA dedicated software engineer and educator with a focus on building multilingual, modular, and open-source applications using Python, Flask, and PostgreSQL.\n\n🔹 Founder of **Flask University** – an initiative to create real-world, open-source Flask projects  \n🔹 Creator of [@TamerOnPi](https://www.youtube.com/@mystrotamer) – a YouTube channel sharing tech, tutorials, and Pi Network insights  \n🔹 Passionate about helping developers learn by building, one milestone at a time\n\nConnect or contribute:\n\n[![GitHub](https://img.shields.io/badge/GitHub-TamerOnLine-181717?style=flat\u0026logo=github)](https://github.com/TamerOnLine)  \n[![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-blue?style=flat\u0026logo=linkedin)](https://www.linkedin.com/in/tameronline/)  \n[![YouTube](https://img.shields.io/badge/YouTube-TamerOnPi-red?style=flat\u0026logo=youtube)](https://www.youtube.com/@mystrotamer)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftameronline%2Ftol_vido","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftameronline%2Ftol_vido","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftameronline%2Ftol_vido/lists"}