{"id":50338264,"url":"https://github.com/busrarafa/podcast-autopilot","last_synced_at":"2026-05-29T15:00:52.798Z","repository":{"id":358936080,"uuid":"1243754236","full_name":"BusraRafa/podcast-autopilot","owner":"BusraRafa","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-19T17:03:23.000Z","size":32,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-19T20:39:48.807Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BusraRafa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-19T16:24:03.000Z","updated_at":"2026-05-19T17:03:27.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/BusraRafa/podcast-autopilot","commit_stats":null,"previous_names":["busrarafa/podcast-autopilot"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/BusraRafa/podcast-autopilot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BusraRafa%2Fpodcast-autopilot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BusraRafa%2Fpodcast-autopilot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BusraRafa%2Fpodcast-autopilot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BusraRafa%2Fpodcast-autopilot
/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BusraRafa","download_url":"https://codeload.github.com/BusraRafa/podcast-autopilot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BusraRafa%2Fpodcast-autopilot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33657690,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-29T15:00:51.636Z","updated_at":"2026-05-29T15:00:52.789Z","avatar_url":"https://github.com/BusraRafa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🎙️ Podcast Autopilot\n\n\u003e **AI-powered podcast summarization pipeline** — transcribe, intelligently trim, and export a condensed highlight reel from any audio file using OpenAI Whisper and GPT-4 Turbo.\n\n---\n\n## 📌 Overview\n\n**Podcast Autopilot** is an end-to-end audio processing pipeline that automatically transforms long-form podcast or lecture recordings into tight, high-signal highlight clips — without touching a single audio editor. 
You drop in an MP3 or WAV file, set a target length (e.g., 30% of the original), and the system handles everything: transcription, intelligent segment selection, audio cropping, and final export.\n\nThe tool is designed for content creators, researchers, and developers who want to extract the most valuable moments from hours of audio — automatically and at scale.\n\n---\n\n## ✨ Features\n\n- **Automatic Speech-to-Text** — Transcribes audio using OpenAI Whisper with word-level timestamps\n- **Timestamped Transcript Formatting** — Groups words into readable lines, split on natural speech pauses (\u003e1 second)\n- **AI-Driven Segment Selection** — Uses GPT-4 Turbo to identify and select only the highest-value segments from the transcript\n- **Target Duration Control** — Specify an exact percentage of the original runtime to keep (default: 30%)\n- **Retry \u0026 Validation Logic** — Automatically re-queries the model if the selected duration is off-target, ensuring accuracy within ±10%\n- **Audio Cropping \u0026 Stitching** — Extracts the selected segments and joins them with smooth 800ms silence padding\n- **Dual Export** — Outputs both a high-quality WAV and a 192kbps MP3 of the final edited audio\n- **Django-Ready Utility Module** — `utils.py` is structured for integration into a Django web application with settings-based API key management\n- **WAV → MP3 Auto-Conversion** — Accepts both WAV and MP3 inputs; WAV files are automatically converted before processing\n\n---\n\n## 🗂️ Project Structure\n\n```\npodcast-autopilot/\n│\n├── main.py                  # Standalone pipeline script with retry/validation logic\n├── utils.py                 # Django-integrated version of the pipeline\n├── test_final_1.py          # Test script variant 1\n├── test_final_2.py          # Test script variant 2\n├── requirements.txt         # All Python dependencies\n├── .gitignore\n│\n├── Bishop Varden Lecture.mp3  # Sample audio file\n└── Tucker.mp3                 # Sample audio 
file\n```\n\n---\n\n## 🔄 How It Works\n\n```\nAudio File (MP3/WAV)\n        │\n        ▼\n  [1] Whisper Transcription\n      (word-level timestamps)\n        │\n        ▼\n  [2] Timestamp Formatting\n      (pause-aware line grouping)\n        │\n        ▼\n  [3] GPT-4 Turbo Summarization\n      (select key segments → JSON)\n        │\n        ▼\n  [4] Duration Validation \u0026 Retry\n      (re-query if off target by \u003e10%)\n        │\n        ▼\n  [5] Audio Cropping \u0026 Stitching\n      (pydub segment extraction)\n        │\n        ▼\n  [6] Export: WAV + MP3\n```\n\n---\n\n## 🛠️ Tech Stack\n\n| Technology | Purpose |\n|---|---|\n| **Python 3.x** | Core language |\n| **OpenAI Whisper** (`whisper-1`) | Speech-to-text transcription with word timestamps |\n| **GPT-4 Turbo** | Intelligent segment selection and summarization |\n| **pydub** | Audio cropping, stitching, and format conversion |\n| **ffmpeg** | Audio backend for pydub |\n| **python-dotenv** | Environment variable management |\n| **Django** | Web framework integration (via `utils.py`) |\n\n---\n\n## ⚙️ Setup \u0026 Installation\n\n### Prerequisites\n\n- Python 3.9+\n- [ffmpeg](https://ffmpeg.org/download.html) installed and available in your system PATH\n- An OpenAI API key\n\n### 1. Clone the repository\n\n```bash\ngit clone https://github.com/BusraRafa/podcast-autopilot.git\ncd podcast-autopilot\n```\n\n### 2. Install dependencies\n\n```bash\npip install -r requirements.txt\n```\n\n### 3. 
Configure your API key\n\nCreate a `.env` file in the project root:\n\n```env\nOPENAI_API_KEY=your_openai_api_key_here\n```\n\n---\n\n## 🚀 Usage\n\n### Standalone Script (`main.py`)\n\nEdit the bottom of `main.py` to point to your audio file:\n\n```python\nif __name__ == \"__main__\":\n    audio_file = \"./your_podcast.mp3\"\n    output_folder = \"./output\"\n\n    result = process_audio_pipeline(audio_file, output_folder, target_percentage=30)\n```\n\nThen run:\n\n```bash\npython main.py\n```\n\n### Function Signature\n\n```python\nprocess_audio_pipeline(\n    audio_file_path: str,   # Path to your MP3 or WAV file\n    output_folder: str,     # Directory for all output files\n    target_percentage: int  # % of original duration to keep (default: 30)\n)\n```\n\n### Output Files\n\nAfter running, the `output_folder` will contain:\n\n| File | Description |\n|---|---|\n| `demo_transcription_formatted_output.txt` | Full timestamped transcript |\n| `output.json` | JSON array of selected segments with timestamps |\n| `\u003cname\u003e_FINAL_EDITED.wav` | Final highlight reel (WAV) |\n| `\u003cname\u003e_FINAL_EDITED.mp3` | Final highlight reel (MP3, 192kbps) |\n\n---\n\n## 📦 Key Dependencies\n\n```\nopenai==2.9.0\nopenai-whisper==20250625\npydub==0.25.1\nffmpeg-python==0.2.0\npython-dotenv==1.2.1\ntorch==2.9.1\n```\n\nSee `requirements.txt` for the full list.\n\n---\n\n## 🧠 Design Decisions\n\n**Why keep exact wording from the transcript?**\nThe pipeline instructs GPT to never paraphrase or modify source text — all selected segments are verbatim excerpts. This ensures the cropped audio matches the selected text exactly, making the JSON-to-audio alignment reliable.\n\n**Why retry logic?**\nLLMs don't always produce outputs of a precise length on the first attempt. 
The pipeline calculates the total duration of selected segments after each response and retries with a stricter prompt if the result deviates by more than 10% from the target.\n\n**Why 800ms silence padding?**\nShort silence gaps between stitched segments make the final audio sound natural rather than abruptly cut. This value is configurable in the code.\n\n---\n\n## 🔮 Potential Extensions\n\n- Web UI via Django for drag-and-drop audio upload\n- Support for YouTube URL input (via `yt-dlp`)\n- Chapter-aware summarization for structured podcasts\n- Speaker diarization to preserve only a specific speaker\n- Batch processing for entire podcast RSS feeds\n\n---\n\n## 📄 License\n\nThis project is open source under the MIT License (see `LICENSE`). Feel free to use, modify, and build upon it.\n\n---\n\n*Built with OpenAI Whisper + GPT-4 Turbo + pydub*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbusrarafa%2Fpodcast-autopilot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbusrarafa%2Fpodcast-autopilot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbusrarafa%2Fpodcast-autopilot/lists"}