https://github.com/modem-dev/podguy
Agent-driven post-production workflow for video podcasts
https://github.com/modem-dev/podguy
Last synced: 4 days ago
JSON representation
Agent-driven post-production workflow for video podcasts
- Host: GitHub
- URL: https://github.com/modem-dev/podguy
- Owner: modem-dev
- License: mit
- Created: 2026-05-15T14:25:26.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-06-10T16:31:03.000Z (6 days ago)
- Last Synced: 2026-06-10T18:13:47.627Z (5 days ago)
- Language: Python
- Homepage:
- Size: 257 KB
- Stars: 8
- Watchers: 0
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Agents: AGENTS.md
Awesome Lists containing this project
README
# podguy
Pi-first post-production tooling for podcast and video-podcast editors who want transcripts, chapters, clip candidates, cuts, and social review exports from local media.
[](https://github.com/modem-dev/podguy/actions/workflows/ci.yml)
## What it does
- Launches `pi` with podcast-specific skills, prompts, and a startup widget.
- Transcribes local audio/video files with Whisper-compatible backends.
- Prepares long transcripts into timecoded chunks for editorial review.
- Scans video episodes for likely interstitials and non-host inserts.
- Generates chapters, clip candidates, cut reports, show notes, quotes, and proper noun checks.
- Cuts selected highlight ranges into review exports for TikTok, Reels, YouTube Shorts, trailers, or social posts.
Generated transcripts, scans, thumbnails, notes, and clip exports go under gitignored `dist/` by default.
## Install
Clone the repo and install the local Node tooling:
```bash
git clone https://github.com/modem-dev/podguy.git
cd podguy
npm install
```
Install the required system tools:
```bash
brew install uv ffmpeg
```
Notes:
- `ffmpeg` is used for fixtures, transcription backends, and clip cutting.
- `uv` runs the Python scripts and optional transcription dependency groups.
- The video scanner is macOS-only and uses Swift / AVFoundation / Vision.
Before first use, authenticate pi with `/login` or your usual provider API key setup.
Set up a real transcription backend when you are ready to transcribe episodes:
```bash
uv sync --group transcribe-mlx # Apple Silicon
uv sync --group transcribe-faster # Cross-platform
uv sync --group transcribe-whisper # OpenAI Whisper package
```
## Quick start
Create an optional show profile:
```bash
cp podguy.example.toml podguy.toml
```
Start podguy from the repo root:
```bash
./podguy
```
Then ask pi for a concrete episode task:
```text
Analyze "episode-006-draft.mp4" as ep006.
Generate chapters for ep006 in timestamp-title format.
Find likely TikTok/Shorts clips for ep006 and cut vertical review exports.
```
For broad requests, podguy should clarify between:
- **quick pass**: optional video scan + transcript + prepared transcript artifacts + short summary
- **full review**: quick pass + chapters + clips + cuts + show notes + quotes + proper noun review
## Common workflows
### Scan a video
```bash
swift scripts/scan_podcast.swift "episode-006-draft.mp4" dist/analysis/ep006/scan 0.5
open dist/analysis/ep006/scan/report.html
```
Key outputs:
- `interstitial_candidates.csv`
- `non_host_candidates.csv`
- `report.html`
- `thumbs/`
Scanner results are heuristic review aids, not exact edit points.
### Transcribe media
```bash
uv run python scripts/transcribe_video.py --list-backends
uv run --group transcribe-mlx python scripts/transcribe_video.py \
"episode-006-draft.mp4" \
dist/analysis/ep006/transcript \
--backend mlx-whisper
```
Key outputs:
- `segments.json`
- `transcript.txt`
- `transcript.srt`
- `transcript.vtt`
- `summary.txt`
Use `--backend mock` only for tests and setup validation.
### Prepare transcript artifacts
```bash
uv run python scripts/prepare_transcript_analysis.py \
dist/analysis/ep006/transcript \
--output-dir dist/analysis/ep006 \
--slug ep006 \
--plain-output-names
```
Key outputs:
- `dist/analysis/ep006/transcript_chunks.md`
- `dist/analysis/ep006/transcript_index.json`
These are the main inputs for chaptering and editorial analysis.
### Cut selected clips
After pi writes `dist/analysis/ep006/clips.md`, cut original-aspect review exports:
```bash
uv run python scripts/cut_clips.py \
"episode-006-draft.mp4" \
dist/analysis/ep006/clips.md \
dist/analysis/ep006/clips/cuts
```
For simple vertical Shorts/TikTok/Reels review exports:
```bash
uv run python scripts/cut_clips.py \
"episode-006-draft.mp4" \
dist/analysis/ep006/clips.md \
dist/analysis/ep006/clips/shorts \
--aspect vertical \
--pad-start 1 \
--pad-end 1
```
The cutter writes generated media plus `manifest.json`. Vertical and square modes use center-crop framing, so treat them as review exports unless the framing has been checked.
### Download real sample media
Use the Cordkillers open-license video-podcast excerpt for local evaluation:
```bash
scripts/download_sample_media.sh
```
This writes to:
```text
dist/test-fixtures/open-license/cordkillers-572/
```
The default sample is a 3m50s excerpt from `00:08:00` of Cordkillers 572, licensed CC BY-SA 4.0. The range includes multiple podcast layouts, lower thirds, chat/sidebar graphics, a Patreon bumper, and an outro/interstitial card. The script writes `ATTRIBUTION.md` next to the generated media.
## Configuration
`podguy.toml` lets you define show-specific context without changing the workflow:
```toml
show_name = "Example Podcast"
show_slug = "example"
hosts = ["Host One", "Host Two"]
tone = "curious, direct, practical"
audience = "builders and technical operators"
chapter_style = "concise descriptive titles"
preferred_review = "quick pass"
```
`podcast.toml` is also accepted as a compatible profile name.
## Project layout
- [`podguy`](podguy): launcher for pi with repo-local skills, prompts, and startup extension.
- [`src/podguy-post-production/SKILL.md`](src/podguy-post-production/SKILL.md): main editorial workflow skill.
- [`src/podguy-clip-cutter/SKILL.md`](src/podguy-clip-cutter/SKILL.md): social clip export workflow skill.
- [`src/podguy-startup.ts`](src/podguy-startup.ts): pi startup widget.
- [`prompts/`](prompts): optional prompt shortcuts.
- [`scripts/`](scripts): deterministic scanner, transcript, prep, fixture, sample, and clip-cutting tools.
- [`tests/`](tests): smoke tests wrapped by Vitest.
- [`AGENTS.md`](AGENTS.md): repo guidance for coding agents.
## Development
Run the full validation surface:
```bash
npm run format:check
npm run lint
npm run typecheck
npm test
```
Run the shell smoke tests directly:
```bash
bash scripts/test.sh
```
CI runs the same checks on macOS because the scanner depends on macOS media APIs.
## Contributing
Small, focused PRs are welcome. Before opening a PR, run the validation commands above.
For workflow or heuristic changes, include:
- media type and OS/backend details
- expected vs actual output
- relevant transcript, scan, or manifest paths when available
See [`CHANGELOG.md`](CHANGELOG.md) for user-visible changes and [`AGENTS.md`](AGENTS.md) for repo maintenance guidance.
## Security
This repo does not have a published security policy yet. If you find a sensitive issue, do not open a public issue. Contact the maintainers privately first.
## Sponsor
Sponsored by [Modem](https://modem.dev?utm_source=github&utm_medium=oss&utm_campaign=oss_podguy&utm_content=readme_footer).
## License
MIT. See [`LICENSE`](LICENSE).
## Support
Use [GitHub issues](https://github.com/modem-dev/podguy/issues) for bugs, questions, and workflow discussion.