https://github.com/doziben/showrunner

# Showrunner

**Open-source AI UGC orchestrator.** Paste a script. Get a director-cut storyboard, lipsynced avatar clips, voiceovers, and b-roll instructions ready for an editor.

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](./LICENSE)
[![SvelteKit](https://img.shields.io/badge/SvelteKit-2-FF3E00.svg)](https://kit.svelte.dev)
[![TypeScript](https://img.shields.io/badge/TypeScript-strict-3178C6.svg)](https://www.typescriptlang.org/)
[![Local-first](https://img.shields.io/badge/local--first-yes-22c55e.svg)](#local-first)
[![Buy Me a Coffee](https://img.shields.io/badge/buy%20me%20a%20coffee-doziben-FFDD00?logo=buy-me-a-coffee&logoColor=000)](https://buymeacoffee.com/doziben)

---

## The pipeline this replaces

The manual UGC workflow most creators run today:

1. ~~Generating voiceovers in ElevenLabs~~
2. ~~Cutting voiceovers per scene~~
3. ~~Writing avatar prompts for each shot~~
4. ~~Pasting prompts into image gen tools~~
5. ~~Uploading images + audio to lipsync tools~~
6. ~~Downloading and labeling everything~~

Showrunner does all of it from a single pasted script. You hand the resulting zip to a video editor, who assembles the final cut.

## What it actually is

Anyone can wrap an image-gen API and a lipsync API. **The product is the storyboard intelligence layer:** a Claude-powered director that reads your script, breaks it into shots, decides which lines should be on-camera (avatar) vs cutaway (b-roll), and orchestrates the right pipeline for each.

Concretely, you get:

- **Storyboard agent.** Claude Opus reads your script and returns a structured shot list optimized for short-form UGC pacing — hooks on-camera, demonstrations as b-roll, CTAs on-camera, pattern interrupts every 8–12 seconds.
- **Locked avatar consistency.** Pick or generate a reference portrait once. Every per-scene shot uses image-to-image generation with that locked reference so the same person appears throughout.
- **Voiceover with v3 audio tags.** ElevenLabs v3 model. `[confident]`, `[pause]`, `[exasperated sigh]` pass straight through to the synthesized audio.
- **Pluggable lipsync.** Three models, choose per project — see [Lipsync models](#lipsync-models).
- **B-roll instructions.** Concrete, brutal recording instructions per cutaway. *"Open ChatGPT, type 'give me content ideas for my brand', let it generate, scroll through the boring output for 3 seconds."* Not *"show ChatGPT failing."*
- **Per-scene retry.** If one provider fails, retry that scene without rerunning the whole project.
- **Local-first.** All keys and projects live in your browser via IndexedDB. There is no database or auth backend, only a thin same-origin proxy (see [Local-first](#local-first)), and nothing leaves your machine except API calls to the providers you configured.
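
To make the avatar/b-roll split concrete, the shot list could be modeled as a discriminated union. This is a minimal sketch, not the project's actual types: only `audioLine` is named in this README, and every other field, plus the `stepsFor` helper, is a hypothetical illustration.

```typescript
// Hypothetical scene shapes. Only `audioLine` appears in this README;
// the remaining fields are illustrative assumptions.
type AvatarScene = {
  kind: "avatar";
  index: number; // 1-based position in the storyboard
  audioLine: string; // spoken text, audio tags included
  shotPrompt: string; // framing instruction for the per-scene image
};

type BrollScene = {
  kind: "broll";
  index: number;
  audioLine: string;
  recordingInstructions: string; // what the creator should record
};

type Scene = AvatarScene | BrollScene;

// On-camera scenes need an image and a lipsync pass;
// b-roll scenes stop after the voiceover.
function stepsFor(scene: Scene): string[] {
  switch (scene.kind) {
    case "avatar":
      return ["voiceover", "image", "lipsync"];
    case "broll":
      return ["voiceover"];
  }
}

const storyboard: Scene[] = [
  { kind: "avatar", index: 1, audioLine: "[confident] Hook line.", shotPrompt: "close-up, eye contact" },
  { kind: "broll", index: 2, audioLine: "[pause] Demo line.", recordingInstructions: "Screen-record the demo." },
];

for (const scene of storyboard) {
  console.log(scene.index, scene.kind, stepsFor(scene).join(" -> "));
}
```

Keying everything off a single `kind` field lets the compiler check that each pipeline branch handles both scene types.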

## Status

| Area | State |
| --- | --- |
| Storyboard agent | ✅ Ships with Claude Opus 4.7 via Vercel AI Gateway or Anthropic direct |
| Avatar generation | ✅ Replicate `openai/gpt-image-2` at `quality=high` for portraits and per-scene shots (edit mode for identity preservation) |
| Voiceover | ✅ ElevenLabs v3 with audio-tag pass-through |
| Lipsync | ✅ Three models: PrunaAI P-Video, Veed Fabric 1.0, Creatify Aurora |
| Bundle export | ✅ Numbered MP4s + MP3s + b-roll markdowns + master README |
| Settings + re-edit | ✅ Re-run any step, swap providers, manage voices |
| Multi-avatar projects | ❌ One avatar per project (see ROADMAP) |
| Auto video stitching | ❌ Hand-off to editors today |
| Hosted SaaS | ❌ Not in v0.x |

## Quick start

**Prerequisites:**

- Node 20 or newer
- pnpm 8+ (recommended) or npm

```bash
git clone https://github.com/doziben/showrunner showrunner
cd showrunner
pnpm install
pnpm dev
```

Open the local URL that `pnpm dev` prints (Vite's default is `http://localhost:5173`). The app boots into a 4-step onboarding wizard. You'll need API keys for the providers below; the wizard tests each connection live.

To produce a production build:

```bash
pnpm build # builds the app with the configured SvelteKit adapter
pnpm preview # serves the production build locally
```

## Required API keys

You'll need accounts with:

| Provider | Used for |
| --- | --- |
| Vercel AI Gateway *(or Anthropic direct)* | Storyboard generation (`claude-opus-4-7`) |
| Replicate | Avatar image generation (`openai/gpt-image-2` @ quality=high) and the P-Video lipsync model |
| ElevenLabs | Voiceovers (`eleven_v3`) |
| fal.ai | Lipsync (Veed Fabric 1.0, Creatify Aurora) |

### Cost ballpark

For a 60-second UGC video with ~30 seconds of avatar footage and the cheapest lipsync model (P-Video):

| Step | Cost |
| --- | --- |
| Voiceovers (ElevenLabs) | ~$0.30 |
| Avatar images (Replicate `gpt-image-2` high) | ~$0.51 (4 avatar scenes × $0.128) |
| Lipsync (P-Video @ $0.02/sec) | ~$0.60 |
| Storyboard agent | negligible |
| **Total** | **~$1.41** |

Showrunner displays the full breakdown — including the projected per-model lipsync cost — *before* you generate, so there are no surprises.
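
The arithmetic behind the table can be reproduced in a few lines. This is a sketch that reuses the rates quoted above; the constant and function names are illustrative and are not taken from `src/lib/helpers/cost.ts`.

```typescript
// Rates copied from the cost table in this README; names are illustrative.
const VOICEOVER_FLAT_USD = 0.3; // ElevenLabs, whole 60-second script
const AVATAR_IMAGE_USD = 0.128; // per gpt-image-2 high-quality image
const P_VIDEO_USD_PER_SEC = 0.02; // cheapest lipsync model

interface CostBreakdown {
  voiceover: number;
  images: number;
  lipsync: number;
  total: number;
}

// Sum the three billable steps; the storyboard agent is treated as negligible.
function estimateCost(avatarScenes: number, avatarSeconds: number): CostBreakdown {
  const voiceover = VOICEOVER_FLAT_USD;
  const images = avatarScenes * AVATAR_IMAGE_USD;
  const lipsync = avatarSeconds * P_VIDEO_USD_PER_SEC;
  return { voiceover, images, lipsync, total: voiceover + images + lipsync };
}

const estimate = estimateCost(4, 30);
console.log(estimate.total.toFixed(2)); // 1.41
```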

## Lipsync models

Choose per project. Cost shown live in the storyboard view.

| Model | Provider | Rate | Resolution | Best for |
| --- | --- | --- | --- | --- |
| **PrunaAI P-Video** | Replicate | **$0.02/sec** | 720p | Cheapest. Optimized variant. Max 10s/clip. |
| **Veed Fabric 1.0** | fal.ai | **$0.08/sec** | 480p | Reliable middle. Wide codec support. |
| **Creatify Aurora** | fal.ai | **$0.14/sec** | 720p | Highest quality. Polished output. |

Default is P-Video. Switch in the right rail of the storyboard view at any time — cost recalculates instantly.

To add a new model: see `src/lib/pipeline/lipsync-models.ts` for the catalog format and `src/lib/pipeline/lipsync.ts` for the dispatcher.
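
As a rough picture of what a catalog entry might carry, here is a sketch built from the table above. The interface and helper names are assumptions; the real catalog format in `src/lib/pipeline/lipsync-models.ts` may differ. The ids follow the dispatcher names listed under Project structure.

```typescript
// Hypothetical catalog entry shape; the project's real format may differ.
interface LipsyncModel {
  id: string;
  label: string;
  provider: "replicate" | "fal";
  usdPerSecond: number;
  maxClipSeconds?: number; // left undefined where this README states no limit
}

const LIPSYNC_MODELS: LipsyncModel[] = [
  { id: "p-video", label: "PrunaAI P-Video", provider: "replicate", usdPerSecond: 0.02, maxClipSeconds: 10 },
  { id: "fabric", label: "Veed Fabric 1.0", provider: "fal", usdPerSecond: 0.08 },
  { id: "aurora", label: "Creatify Aurora", provider: "fal", usdPerSecond: 0.14 },
];

function findModel(modelId: string): LipsyncModel {
  const model = LIPSYNC_MODELS.find((m) => m.id === modelId);
  if (!model) throw new Error(`Unknown lipsync model: ${modelId}`);
  return model;
}

// Projected lipsync cost in USD, rounded to whole cents.
function projectedCost(modelId: string, seconds: number): number {
  return Math.round(findModel(modelId).usdPerSecond * seconds * 100) / 100;
}

// True when a clip is longer than the model's stated per-clip limit.
function exceedsClipLimit(modelId: string, clipSeconds: number): boolean {
  const limit = findModel(modelId).maxClipSeconds;
  return limit !== undefined && clipSeconds > limit;
}

for (const model of LIPSYNC_MODELS) {
  console.log(model.label, projectedCost(model.id, 30));
}
```

Keeping rate and clip limit in one record means the cost preview and the dispatcher can read the same source.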

## ElevenLabs v3 audio tags

Voiceovers run on `eleven_v3`, which natively interprets emotion and pacing tags written inline in your script. Showrunner passes them through verbatim — through the storyboard agent, into each scene's `audioLine`, all the way to the TTS call. Nothing gets stripped or escaped.

Categories that work:

| Category | Examples |
| --- | --- |
| Emotion | `[confident]`, `[hesitant]`, `[sarcastic]`, `[whispering]`, `[shouting]`, `[exasperated sigh]`, `[laughs]`, `[crying]` |
| Pacing | `[pause]`, `[long pause]`, `[fast]`, `[slow]` |
| Timing | `[breathes in]`, `[breathes out]`, `[clears throat]` |
| Composite | `[slightly mocking]`, `[shifting tone, energetic]`, `[warm, inviting]` *(free-form descriptive tags work too)* |

Example script that exercises the full range — paste this verbatim into a new project:

```
[confident] Most businesses are using AI for content the wrong way.

[slightly mocking] They open ChatGPT, type "give me content ideas for my brand," [pause] and get back the same generic list every other brand in their niche is getting.

[leaning in] Here's what nobody tells you.

[exasperated sigh] So you spend an hour going back and forth, [tired] and the content still sounds mid.

[shifting tone, energetic] That's why we built the Postana calendar agent.

[warm, inviting] Comment "POSTANA" and I'll send you the link.
```

Default voice settings are `stability: 0.5`, `similarity_boost: 0.75` — the v3-recommended balance for expressive delivery without identity drift. Tune in `src/lib/pipeline/voiceover.ts` if a specific voice needs different values.
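
For orientation, a request carrying those defaults might be assembled as below. This is a sketch based on ElevenLabs' public text-to-speech endpoint; `buildTtsRequest` and its return shape are hypothetical, and the app itself routes calls through its own `/api/*` bridge, so the URL it actually hits may differ.

```typescript
// Illustrative helper; not the code in src/lib/pipeline/voiceover.ts.
interface TtsRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    text: string;
    model_id: string;
    voice_settings: { stability: number; similarity_boost: number };
  };
}

function buildTtsRequest(apiKey: string, voiceId: string, audioLine: string): TtsRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: {
      text: audioLine, // tags such as [pause] stay in the text untouched
      model_id: "eleven_v3",
      voice_settings: { stability: 0.5, similarity_boost: 0.75 },
    },
  };
}

const ttsRequest = buildTtsRequest("YOUR_KEY", "voice123", "[confident] Hello. [pause] Welcome back.");
console.log(JSON.stringify(ttsRequest.body, null, 2));
```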

Tags don't count toward the duration estimate (we strip them for word-counting in `src/lib/helpers/duration.ts`), but they do affect the spoken output, so generated audio may run slightly longer than the estimate when you stack many `[pause]` tags.
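
A sketch of how tag stripping for the estimate could work. The function names and the 150 words-per-minute rate are assumptions for illustration; the values actually used in `src/lib/helpers/duration.ts` are not documented here.

```typescript
// Assumed speaking rate; the project's real value may differ.
const WORDS_PER_MINUTE = 150;

// Remove every [bracketed] tag and collapse the leftover whitespace.
function stripAudioTags(line: string): string {
  return line.replace(/\[[^\]]*\]/g, " ").replace(/\s+/g, " ").trim();
}

// Estimate spoken duration from the word count of the tag-free text.
function estimateSeconds(line: string): number {
  const spoken = stripAudioTags(line);
  const words = spoken === "" ? 0 : spoken.split(" ").length;
  return (words * 60) / WORDS_PER_MINUTE;
}

const sampleLine =
  "[exasperated sigh] So you spend an hour going back and forth, [tired] and the content still sounds mid.";
console.log(stripAudioTags(sampleLine));
console.log(estimateSeconds(sampleLine)); // 6 (15 words at the assumed rate)
```

Because the tags are removed only for counting, the unmodified line is still what gets sent to the TTS call.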

## How it works

```
                    ┌──────────────────────────────┐
                    │  Your script (paste)         │
                    └──────────────┬───────────────┘
                                   │
                                   ▼
                    ┌──────────────────────────────┐
                    │  Claude (storyboard agent)   │
                    │  spec.md §6 prompt           │
                    └──────────────┬───────────────┘
                                   │
                       Scene[] (avatar / broll)
                                   │
                                   ▼
                    ┌──────────────────────────────┐
                    │  Per-scene orchestrator      │
                    │  Concurrency 2, retry x3     │
                    └──┬─────────────┬─────────────┘
                       │             │
               ┌───────▼──┐    ┌─────▼──────┐
               │ AVATAR   │    │  B-ROLL    │
               └───┬──────┘    └─────┬──────┘
                   │                 │
       Voiceover (ElevenLabs)  Voiceover only
                   │                 │
       Image (Replicate              │  ← user records
       gpt-image-2 quality=high,     │    this visual
       edit mode w/ locked ref)      │
                   │                 │
Lipsync (P-Video / Fabric / Aurora)  │
                   │                 │
                   └────────┬────────┘
                            ▼
                 ┌──────────────────────┐
                 │  project.zip         │
                 │  • 01_avatar.mp4     │
                 │  • 01_voiceover.mp3  │
                 │  • 02_broll.md       │
                 │  • 02_voiceover.mp3  │
                 │  • README.md         │
                 └──────────────────────┘
```
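
The "Concurrency 2, retry x3" box can be pictured as a small worker pool. The sketch below is not the code in `src/lib/pipeline/orchestrator.ts`: the names are hypothetical, and it assumes "retry x3" means three attempts in total per scene.

```typescript
type Task<T> = () => Promise<T>;
type Outcome<T> = { status: "done"; value: T } | { status: "failed"; error: unknown };

// Run one task, allowing up to `maxAttempts` attempts in total.
async function withRetry<T>(task: Task<T>, maxAttempts: number): Promise<Outcome<T>> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { status: "done", value: await task() };
    } catch (error) {
      lastError = error;
    }
  }
  return { status: "failed", error: lastError };
}

// Run tasks with at most `concurrency` in flight; results keep input order.
async function runBounded<T>(
  tasks: Task<T>[],
  concurrency = 2,
  maxAttempts = 3,
): Promise<Outcome<T>[]> {
  const results: Outcome<T>[] = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unclaimed index until none are left.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await withRetry(tasks[index], maxAttempts);
    }
  }

  const workers = Array.from({ length: Math.min(concurrency, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

A failed scene ends up as a `failed` outcome instead of rejecting the whole run, which is what makes a per-scene retry possible afterwards.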

## Local-first

Showrunner is local-first with a thin server proxy. Concretely:

- **API keys** live in IndexedDB on the origin you run the app from. They are not synced anywhere and never persist server-side.
- **Avatars, projects, transactions, and generated outputs** (audio, images, lipsync videos as base64) all live in IndexedDB.
- **The SvelteKit `/api/*` endpoints are CORS bridges only.** They forward your request to the provider with the auth headers you supplied and stream the response back. They don't store, log, or process anything.
- **Different browsers / browser profiles get separate stores.** Run isolated instances by using two browsers.
- **Clearing site data wipes everything.** Export bundles you care about right after generation.
- **No telemetry.** Showrunner never phones home.

The reason for the proxy: Replicate, Anthropic, and most provider APIs CORS-block direct browser calls. Without a server bridge the pipeline simply doesn't work. With it, the architecture stays effectively "your machine talks to your providers" — the SvelteKit server is just a same-origin CORS hop.
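
The pass-through idea can be sketched without any framework code. Everything below is hypothetical: the header allow-list, the types, and the injected `upstream` function stand in for what a SvelteKit `+server.ts` handler would do with `fetch`; they are not copied from the project.

```typescript
// Hypothetical allow-list of headers the bridge forwards to the provider.
const FORWARDED_HEADERS = ["authorization", "content-type", "x-api-key", "xi-api-key"];

interface BridgeRequest {
  method: string;
  headers: Record<string, string>;
  body?: string;
}

interface BridgeResponse {
  status: number;
  body: string;
}

// Stand-in for the real network call, injected so the sketch runs offline.
type Upstream = (url: string, request: BridgeRequest) => Promise<BridgeResponse>;

// Forward the caller's request with only the allow-listed headers.
// Nothing is stored or logged; the provider's response is returned as-is.
async function bridge(
  targetUrl: string,
  incoming: BridgeRequest,
  upstream: Upstream,
): Promise<BridgeResponse> {
  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(incoming.headers)) {
    const lower = name.toLowerCase();
    if (FORWARDED_HEADERS.includes(lower)) headers[lower] = value;
  }
  return upstream(targetUrl, { method: incoming.method, headers, body: incoming.body });
}
```

The key travels with each request from the browser, so the server never needs its own copy.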

See `SECURITY.md` for the full threat model.

## Customization

Most likely edits, by file:

| Want to change | Edit |
| --- | --- |
| How Claude breaks down scripts | `src/lib/pipeline/prompts.ts` (`STORYBOARD_SYSTEM_PROMPT`) |
| How avatar shots are framed | `src/lib/pipeline/prompts.ts` (`buildAvatarShotPrompt`) |
| Lipsync model pricing or add a new model | `src/lib/pipeline/lipsync-models.ts` + `src/lib/pipeline/lipsync.ts` |
| Image gen model (currently `openai/gpt-image-2` @ quality=high) | `src/lib/pipeline/avatar-image.ts` |
| Voiceover settings (stability, similarity_boost) | `src/lib/pipeline/voiceover.ts` |
| Cost estimates | `src/lib/helpers/cost.ts` (`PRICING`) |
| Tailwind tokens, theme | `src/app.css` |

The prompts in sections 6 and 7 of `spec.md` are the source of truth; keep them in sync if you change the code.

## Icons

Use **Huge Icons** only ([Iconify listing](https://icon-sets.iconify.design/hugeicons)). Wire them through **`@iconify/svelte`** and the **`HIcon`** wrapper (`src/lib/components/HIcon.svelte`): pass the icon name without the `hugeicons:` prefix. The set is registered once from `src/lib/register-hugeicons.ts`, imported by `src/routes/+layout.svelte`. Do not add Lucide, Heroicons, or other icon packs for new UI.

## Project structure

```
src/
├── routes/ # SvelteKit pages
│ ├── +layout.svelte # App shell, routing guard
│ ├── onboarding/ # 4-step wizard
│ ├── avatars/ # CRUD for locked reference portraits
│ ├── projects/ # Storyboard workspace
│ └── settings/ # Re-edit config
├── lib/
│ ├── pipeline/ # Provider integrations + orchestrator
│ │ ├── prompts.ts # Storyboard + avatar shot prompts (§6, §7)
│ │ ├── storyboard.ts # Claude via @ai-sdk/gateway or @ai-sdk/anthropic
│ │ ├── avatar-image.ts # Replicate openai/gpt-image-2 (quality=high) + per-scene edit
│ │ ├── voiceover.ts # ElevenLabs v3 TTS
│ │ ├── lipsync.ts # Dispatcher: p-video / fabric / aurora
│ │ ├── lipsync-models.ts # Model catalog (label, rate, max duration)
│ │ ├── orchestrator.ts # Bounded-parallel runner, retry, per-scene status
│ │ └── export.ts # JSZip bundle + master README
│ ├── stores/ # Reactive Svelte stores (config, avatars, projects, jobs)
│ ├── db/ # Dexie schema + migrations
│ ├── components/ # UI components (shadcn-svelte + custom)
│ ├── helpers/ # Pure utilities (cost, duration, image, audio)
│ └── types/ # Shared TypeScript types
├── app.css # Tailwind v4 + Flora-inspired tokens
└── app.html
```

## Deployment

Showrunner is a SvelteKit app built with `adapter-auto`. Pages render as an SPA, while the `+server.ts` API routes proxy provider calls through same-origin server endpoints, so the host must be able to run server-side functions. Options:

- **Vercel** — push the repo, Vercel detects SvelteKit, done. (`adapter-auto` selects `adapter-vercel` automatically.)
- **Cloudflare Pages / Workers** — install `@sveltejs/adapter-cloudflare` and switch the adapter import.
- **Netlify** — install `@sveltejs/adapter-netlify`, switch adapter.
- **Self-host (Node)** — install `@sveltejs/adapter-node`, switch adapter; `pnpm build && node build` to run.

There are no env vars to set. Each user pastes their own keys in the onboarding flow.

## Troubleshooting

**fal.ai upload failures on Safari.** fal.ai's storage upload occasionally has issues with Safari's strict cookie handling. Try Chrome or Firefox if the lipsync step keeps failing on Safari specifically.

**ElevenLabs returns 401 even though the key tested OK.** Voice IDs are scoped to the workspace that created them. Make sure the voice IDs you added in onboarding belong to the same ElevenLabs workspace as the API key.

**Storyboard agent returns scenes that don't follow the spec rules.** The model is non-deterministic. Re-running usually fixes drift. If a specific kind of script consistently breaks the rules, edit `STORYBOARD_SYSTEM_PROMPT` in `src/lib/pipeline/prompts.ts` to add a rule.

**Browser storage filling up.** Generated lipsync videos can be large. The detail screen of each project shows you what's stored; export and delete completed projects to free space.
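
As a rule of thumb for how quickly storage fills, base64 turns every 3 bytes of binary data into 4 characters, so stored outputs are roughly a third larger than the raw files. The sketch below only does that arithmetic; the helper names are illustrative, and how the browser stores the resulting strings internally is not covered.

```typescript
// Length of the padded base64 string for `rawBytes` bytes of binary data.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Total base64 characters for a set of raw file sizes (in bytes).
function estimateStoredChars(rawFileSizes: number[]): number {
  return rawFileSizes.reduce((sum, bytes) => sum + base64Length(bytes), 0);
}

const fiveMegabyteClip = 5 * 1024 * 1024;
console.log(base64Length(fiveMegabyteClip)); // 6990508
console.log(estimateStoredChars([fiveMegabyteClip, fiveMegabyteClip]));
```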

**Reset the app entirely.** Settings → "Reset configuration", or clear site data via your browser's developer tools.

## Spec

The full build spec lives in `spec.md`. It documents the data models, the storyboard agent prompt, the avatar shot template, the locked tech stack, the user flows, the acceptance criteria, and what's explicitly out of scope. Read it before making non-trivial changes.

## Roadmap

See `ROADMAP.md` for what's parked for future versions (multi-avatar projects, video stitching, captions, hosted SaaS).

## Contributing

See `CONTRIBUTING.md`.

Quick version: read `spec.md`, keep it client-side, run `pnpm exec svelte-check && pnpm build && pnpm lint` before opening a PR.

## Security

See `SECURITY.md` for the threat model and how to report vulnerabilities.

## Support the project

Showrunner is free under MIT. If it saves you time and you want to chip in toward continued development:

[![Buy Me a Coffee](https://img.shields.io/badge/Buy%20Me%20a%20Coffee-doziben-FFDD00?logo=buy-me-a-coffee&logoColor=000)](https://buymeacoffee.com/doziben)

Donations support development time and provider testing costs (every model swap eats real $$ to validate). They are not payments for support, features, or refunds — just a tip jar.

## Acknowledgments

- **Spec + first build:** Solomon Nwabuoku
- **Design language:** clones the [Flora AI](https://florafauna.ai) aesthetic — dark, mono, sans-only, content-forward
- **UI primitives:** [shadcn-svelte](https://www.shadcn-svelte.com) on top of [bits-ui](https://www.bits-ui.com)
- **Sample script:** the example in `spec.md` references [Postana](https://postana.app), the calendar agent

## License

MIT — see [`LICENSE`](./LICENSE).

You can use, modify, distribute, and commercialize this code. Keep the copyright notice in any copies or substantial portions you redistribute.