{"id":29195506,"url":"https://github.com/aastroza/tvtxt","last_synced_at":"2025-07-20T05:32:40.240Z","repository":{"id":296966969,"uuid":"993573095","full_name":"aastroza/tvtxt","owner":"aastroza","description":"[WIP] AI that \"reads\" live TV and writes it as a movie script in real-time.","archived":false,"fork":false,"pushed_at":"2025-06-03T17:21:30.000Z","size":1076,"stargazers_count":20,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-02T05:04:39.972Z","etag":null,"topics":["fasthtml","llm","modal","outlines","tv","wip","work-in-progress"],"latest_commit_sha":null,"homepage":"https://tvtxt.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aastroza.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-31T03:44:25.000Z","updated_at":"2025-06-26T04:06:20.000Z","dependencies_parsed_at":"2025-06-03T17:07:00.877Z","dependency_job_id":"365002eb-d745-477a-9d60-5c7de6bfa1b6","html_url":"https://github.com/aastroza/tvtxt","commit_stats":null,"previous_names":["aastroza/tvtxt"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aastroza/tvtxt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aastroza%2Ftvtxt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aastroza%2Ftvtxt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aastroza%2Ftvtxt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aas
troza%2Ftvtxt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aastroza","download_url":"https://codeload.github.com/aastroza/tvtxt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aastroza%2Ftvtxt/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266071519,"owners_count":23871940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fasthtml","llm","modal","outlines","tv","wip","work-in-progress"],"created_at":"2025-07-02T05:04:40.041Z","updated_at":"2025-07-20T05:32:40.194Z","avatar_url":"https://github.com/aastroza.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tvtxt 📺✨\n\n\u003e **⚠️ Work in Progress - Technology Showcase**  \n\u003e This is an experimental MVP demonstrating real-time AI capabilities. Not intended as a production-ready product.\n\n**Turn any live TV stream into a real-time movie script. AI watches, transcribes, and writes television as cinema.**\n\nEver wondered what your favorite TV show would look like as a screenplay? tvtxt is an AI-powered pipeline that watches live television streams and transforms them into properly formatted movie scripts in real-time. 
Think of it as having a tireless scriptwriter that never blinks, never sleeps, and never misses a moment.\n\n## Live Demo\nWatch the magic unfold in real-time: [tvtxt live demo](https://tvtxt.com/)\n\n![screenshot](/tvtxt.PNG)\n\n\n## Project Status\n\nThis is a **proof-of-concept showcase** built to demonstrate the integration of several cutting-edge technologies:\n- Real-time speech recognition.\n- Vision-language understanding.\n- Cloud-native AI inference.\n- Live streaming media processing.\n\n**What this is:**\n- A technology demonstration.\n- An experimental MVP.\n- A learning playground for AI + media processing.\n\n**What this is NOT:**\n- A production-ready application.\n- A commercial product.\n- A fully-featured streaming service.\n\n## The magic behind the curtain\n\n**tvtxt** combines cutting-edge AI models with cloud infrastructure to create a TV-to-screenplay transformation:\n\n\n### **[Modal](https://modal.com/)**\nModal handles our cloud GPU infrastructure, running two critical AI workloads:\n- **[Parakeet ASR Model (NVIDIA)](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2)**: Transcribes speech with remarkable accuracy and speed.\n- **[Qwen2-VL Vision-Language Model](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)**: Describes visual scenes with cinematic flair.\n\n### **[Outlines](https://github.com/dottxt-ai/outlines)**\nEnsures our vision model outputs perfectly formatted JSON responses:\n- **Schema enforcement**: Guarantees consistent screenplay structure.\n\n### Others\n\n- **[Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs)**: Temporarily stores captured video frames for visual analysis.\n- **[Redis Cloud](https://redis.io/cloud/)**: Acts as the bridge between our backend pipeline and frontend display.\n- **[FastHTML](https://www.fastht.ml/)**: Creates our live web interface with authentic screenplay styling.\n- **[FFmpeg](https://ffmpeg.org/)**: The unsung hero that handles all media processing.\n\n## How 
the Magic Happens\n\n1. **🎥 Stream Capture**: FFmpeg latches onto a live TV stream, extracting both audio and video.\n2. **🎧 Audio Analysis**: Every 10 seconds, audio chunks are sent to Modal's Parakeet ASR model for transcription.\n3. **📸 Frame Extraction**: When speech is detected, FFmpeg captures a corresponding video frame.\n4. **☁️ Image Upload**: The frame is uploaded to Azure Blob Storage and gets a public URL.\n5. **👁️ Visual Understanding**: Modal's Qwen2-VL model analyzes the image and generates a screenplay-formatted scene description.\n6. **💾 Memory Update**: The latest transcription and scene description are saved to Redis Cloud.\n7. **🖥️ Live Display**: FastHTML serves a web page that auto-refreshes, showing the generated screenplay.\n8. **🔄 Repeat**: The cycle continues, creating an ever-updating script of live television.\n\n## Installation \u0026 Setup\n\n### 1. **Environment Setup**\n```bash\nuv venv\nsource .venv/bin/activate  # or `.venv\\Scripts\\activate` on Windows\nuv pip install -r requirements.txt\nmodal token new\n```\n\n### 2. **Configure Your Credentials**\nCreate a `.env` file with your secret weapons:\n```env\n# Azure Blob Storage (for frame storage)\nAZURE_STORAGE_CONNECTION_STRING=your_azure_connection_string\n\n# Redis Cloud (for state management)\nREDIS_HOST=your_redis_host\nREDIS_PORT=your_redis_port\nREDIS_USERNAME=your_redis_username\nREDIS_PASSWORD=your_redis_password\n\n# HuggingFace (for model access)\nHF_TOKEN=your_huggingface_token\n\n# Modal endpoint (will be generated after deployment)\nIMAGE_DESCRIBER_URL=your_modal_endpoint_url\n```\n\n### 3. **Deploy the Vision AI**\nLaunch your scene description model to the cloud:\n```bash\nmodal deploy scene_describer.py\n```\n*Note: Copy the generated endpoint URL to your `.env` file as `IMAGE_DESCRIBER_URL`*\n\n### 4. **Start the Show**\nFire up the transcription pipeline:\n```bash\nmodal run ingest.py\n```\n\n### 5. 
**Watch the Magic**\nLaunch the web interface:\n```bash\ncd app\npython main.py\n```\n\nOpen your browser to `http://localhost:5001` and watch as live TV transforms into screenplay format before your eyes!\n\n## Philosophy\n\ntvtxt embraces ephemerality by design. Like live theater, each moment exists only in the present:\n- **No databases**: Only the current scene matters.\n- **No history**: Previous scripts vanish like morning mist.\n- **No storage**: Frames and audio exist only long enough to be processed.\n\n\n## Disclaimer\n\nThis project demonstrates real-time AI transcription and visual analysis using Al Jazeera English as a public live stream. No content is stored, archived, or redistributed. The system processes live broadcasts in real-time for educational and demonstration purposes only.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faastroza%2Ftvtxt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faastroza%2Ftvtxt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faastroza%2Ftvtxt/lists"}