{"id":49172494,"url":"https://github.com/browser-use/video-use","last_synced_at":"2026-06-04T13:30:31.796Z","repository":{"id":353386309,"uuid":"1208221543","full_name":"browser-use/video-use","owner":"browser-use","description":"Edit videos with coding agents","archived":false,"fork":false,"pushed_at":"2026-05-10T19:06:11.000Z","size":558,"stargazers_count":7134,"open_issues_count":12,"forks_count":1023,"subscribers_count":54,"default_branch":"main","last_synced_at":"2026-05-10T21:10:43.825Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/browser-use.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-12T01:34:42.000Z","updated_at":"2026-05-10T21:00:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/browser-use/video-use","commit_stats":null,"previous_names":["browser-use/video-use"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/browser-use/video-use","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/browser-use%2Fvideo-use","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/browser-use%2Fvideo-use/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/browser-use%2Fvideo-use/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/browser-use%2Fvideo-use/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/browser-use","download_url":"https://codeload.github.com/browser-use/video-use/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/browser-use%2Fvideo-use/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33907693,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-04T02:00:06.755Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-22T20:00:44.251Z","updated_at":"2026-06-04T13:30:31.791Z","avatar_url":"https://github.com/browser-use.png","language":"Python","funding_links":[],"categories":["📋 Changelog","Python","Repos"],"sub_categories":["2026-07-12 — Weekly Update"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/video-use-banner.png\" alt=\"video-use\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n# video-use\n\nIntroducing **video-use** — edit videos with Claude Code. 100% open source.\n\nDrop raw footage in a folder, chat with Claude Code, get `final.mp4` back. Works for any content — talking heads, montages, tutorials, travel, interviews — without presets or menus.\n\n## What it does\n\n- **Cuts out filler words** (`umm`, `uh`, false starts) and dead space between takes\n- **Auto color grades** every segment (warm cinematic, neutral punch, or any custom ffmpeg chain)\n- **30ms audio fades** at every cut so you never hear a pop\n- **Burns subtitles** in your style — 2-word UPPERCASE chunks by default, fully customizable\n- **Generates animation overlays** via [HyperFrames](https://github.com/heygen-com/hyperframes), [Remotion](https://www.remotion.dev/), [Manim](https://www.manim.community/), or PIL — spawned in parallel sub-agents, one per animation\n- **Self-evaluates the rendered output** at every cut boundary before showing you anything\n- **Persists session memory** in `project.md` so next week's session picks up where you left off\n\n## Setup prompt\n\nPaste into Claude Code, Codex, Hermes, Openclaw, or any agent with shell access:\n\n```text\nSet up https://github.com/browser-use/video-use for me.\n\nRead install.md first to install this repo, wire up ffmpeg, register the skill with whichever agent you're running under, and set up the ElevenLabs API key — ask me to paste it when you need it. Then read SKILL.md for daily usage, and always read helpers/ because that's where the editing scripts live. After install, don't transcribe anything on your own — just tell me it's ready and wait for me to drop footage into a folder.\n```\n\nThe agent handles the clone, dependencies, skill registration, and prompts you once for your ElevenLabs API key (grab one at [elevenlabs.io/app/settings/api-keys](https://elevenlabs.io/app/settings/api-keys)).\n\nThen point your agent at a folder of raw takes:\n\n```bash\ncd /path/to/your/videos\nclaude    # or codex, hermes, etc.\n```\n\nFor always-on editing from your own VPS or Telegram, run the agent through [Browser Use Box](https://browser-use.com/bux). [Watch the 15-second demo](https://www.tiktok.com/@browser_use/video/7639824093721758989).\n\nAnd in the session:\n\n\u003e edit these into a launch video\n\nIt inventories the sources, proposes a strategy, waits for your OK, then produces `edit/final.mp4` next to your sources. All outputs live in `\u003cvideos_dir\u003e/edit/` — the skill directory stays clean.\n\n## Manual install\n\nIf you'd rather do it by hand:\n\n```bash\n# 1. Clone and symlink into your agent's skills directory\ngit clone https://github.com/browser-use/video-use ~/Developer/video-use\nln -sfn ~/Developer/video-use ~/.claude/skills/video-use        # Claude Code\n# ln -sfn ~/Developer/video-use ~/.codex/skills/video-use       # Codex\n\n# 2. Install deps\ncd ~/Developer/video-use\nuv sync                         # or: pip install -e .\nbrew install ffmpeg             # required\nbrew install yt-dlp             # optional, for downloading online sources\n\n# 3. Add your ElevenLabs API key\ncp .env.example .env\n$EDITOR .env                    # ELEVENLABS_API_KEY=...\n```\n\n## How it works\n\nThe LLM never watches the video. It **reads** it — through two layers that together give it everything it needs to cut with word-boundary precision.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/timeline-view.svg\" alt=\"timeline_view composite — filmstrip + speaker track + waveform + word labels + silence-gap cut candidates\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n**Layer 1 — Audio transcript (always loaded).** One ElevenLabs Scribe call per source gives word-level timestamps, speaker diarization, and audio events (`(laughter)`, `(applause)`, `(sigh)`). All takes pack into a single ~12KB `takes_packed.md` — the LLM's primary reading view.\n\n```\n## C0103  (duration: 43.0s, 8 phrases)\n  [002.52-005.36] S0 Ninety percent of what a web agent does is completely wasted.\n  [006.08-006.74] S0 We fixed this.\n```\n\n**Layer 2 — Visual composite (on demand).** `timeline_view` produces a filmstrip + waveform + word labels PNG for any time range. Called only at decision points — ambiguous pauses, retake comparisons, cut-point sanity checks.\n\n\u003e Naive approach: 30,000 frames × 1,500 tokens = **45M tokens of noise**.\n\u003e Video Use: **12KB text + a handful of PNGs**.\n\nSame idea as browser-use giving an LLM a structured DOM instead of a screenshot — but for video.\n\n## Pipeline\n\n```\nTranscribe ──\u003e Pack ──\u003e LLM Reasons ──\u003e EDL ──\u003e Render ──\u003e Self-Eval\n                                                              │\n                                                              └─ issue? fix + re-render (max 3)\n```\n\nThe self-eval loop runs `timeline_view` on the _rendered output_ at every cut boundary — catches visual jumps, audio pops, hidden subtitles. You see the preview only after it passes.\n\n## Design principles\n\n1. **Text + on-demand visuals.** No frame-dumping. The transcript is the surface.\n2. **Audio is primary, visuals follow.** Cuts come from speech boundaries and silence gaps.\n3. **Ask → confirm → execute → self-eval → persist.** Never touch the cut without strategy approval.\n4. **Zero assumptions about content type.** Look, ask, then edit.\n5. **12 hard rules, artistic freedom elsewhere.** Production-correctness is non-negotiable. Taste isn't.\n\nSee [`SKILL.md`](./SKILL.md) for the full production rules and editing craft.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrowser-use%2Fvideo-use","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrowser-use%2Fvideo-use","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrowser-use%2Fvideo-use/lists"}