{"id":50660093,"url":"https://github.com/poonamsnair/clipsheet","last_synced_at":"2026-06-08T02:01:02.728Z","repository":{"id":354390167,"uuid":"1223337563","full_name":"poonamsnair/clipsheet","owner":"poonamsnair","description":"Turn any video into images Claude, Codex, Gemini or a Cursor agent can read    ","archived":false,"fork":false,"pushed_at":"2026-04-28T10:25:17.000Z","size":7611,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-28T12:10:19.905Z","etag":null,"topics":["ai-agents","claude-cli","cli","codex-cli","ffmeg","gemini-cli","python","video-ai-tools","video-processing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/poonamsnair.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-28T08:20:27.000Z","updated_at":"2026-04-28T10:25:21.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/poonamsnair/clipsheet","commit_stats":null,"previous_names":["poonamsnair/clipsheet"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/poonamsnair/clipsheet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poonamsnair%2Fclipsheet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poonamsnair%2Fclipsheet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poonamsnair%2Fclipsheet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poonamsnair%2Fclipsheet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/poonamsnair","download_url":"https://codeload.github.com/poonamsnair/clipsheet/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poonamsnair%2Fclipsheet/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34044919,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","claude-cli","cli","codex-cli","ffmeg","gemini-cli","python","video-ai-tools","video-processing"],"created_at":"2026-06-08T02:00:42.350Z","updated_at":"2026-06-08T02:01:02.701Z","avatar_url":"https://github.com/poonamsnair.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"examples/logo.png\" alt=\"clipsheet logo\" width=\"200\"\u003e\n\u003c/p\u003e\n\n# clipsheet\n\n\u003e **Turn any video into images your AI agent can read.**\n\nPlaywright, browser-use, multimodal LLM video analysis, and native video APIs are slow — often minutes per run, expensive, and overkill when you just need to see what happened on screen. clipsheet converts any video into a handful of annotated grid images that any vision-capable model can read in one pass. Record a screen, drop in a clip, hand it a product demo — if it's a video, clipsheet can process it.\n\n[![PyPI](https://img.shields.io/pypi/v/clipsheet.svg)](https://pypi.org/project/clipsheet/)\n[![Python](https://img.shields.io/pypi/pyversions/clipsheet.svg)](https://pypi.org/project/clipsheet/)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n\n![A 20-second agent UI recording turned into a readable 3x3 grid, with timestamps and cell labels in each cell](examples/demo-grid.jpg)\n\n![Claude analyzing the grid and identifying 4 bugs — Vertex AI quota error, misleading recovery prompt, raw JSON leak, and intent mismatch](examples/demo-errors.svg)\n\nAny video → 2-4 grid images → one model call. Process multiple videos at once. CPU-only, no GPU, no audio, no API keys. Best for videos under 5 minutes — beyond that, consider [Gemini native video](https://ai.google.dev/gemini-api/docs/video) or [Twelve Labs](https://twelvelabs.io).\n\n---\n\n## Why clipsheet?\n\n| Approach | Time for a 2-min recording | Cost | What the agent sees |\n|---|---|---|---|\n| **Playwright / browser-use** | 2+ minutes (real-time) | Compute + browser | Screenshots you scripted |\n| **Gemini native video** | 30-60s upload + processing | ~$0.02-0.10/min video | Every frame (slow, expensive) |\n| **Any video + clipsheet** | 1-2 seconds processing | Free (CPU-only) | Deduplicated keyframes with timestamps |\n\n---\n\n## 30-second start\n\n```bash\npip install clipsheet\nclipsheet recording.mp4\n```\n\nOutput lands in `recording_clips/` next to your video:\n\n```\nrecording_clips/\n  grid_01.jpg       3×3 mosaic, cells labeled A1..C3, timestamps burned in\n  grid_02.jpg       next 9 frames in time order\n  manifest.json     maps each cell back to its source timestamp\n```\n\n![clipsheet in action — processing a screen recording into grid images](examples/demo1.gif)\n\n---\n\n## Use it from your coding agent\n\n### Install the skill (one time)\n\n```bash\nclipsheet init\n# or: npx skills add poonamsnair/clipsheet\n```\n\n### Then just talk to your agent\n\n**Claude Code:**\n```\n\u003e /clipsheet review this video for bugs: ~/Downloads/bug-repro.mp4\n\u003e /clipsheet what errors do you see in these two recordings: flow1.mp4 flow2.mp4\n```\n\n**Cursor:**\n```\n\u003e /clipsheet debug this flow: recording.mp4\n```\n\n**Codex CLI:**\n```\n\u003e $clipsheet what UI states appear in this recording: session.mp4\n```\n\n**Any agent with shell access** (no skill needed):\n```\n\u003e run clipsheet on recording.mp4 and tell me what went wrong\n```\n\n### Real-world examples\n\n**Debug agentic applications — see how users interact with your agent's UI:**\n```\n\u003e /clipsheet ~/Downloads/agent-session.mp4\n\u003e the chat layer is clashing with the sidebar — what's happening at each step?\n```\n\n**Review short-form content — get feedback on hooks, pacing, and visual elements:**\n```\n\u003e /clipsheet ~/Desktop/reel-draft.mp4\n\u003e rate the hook, suggest a better opening, and write the transcript\n```\n\n**Debug web animations and 3D components:**\n```\n\u003e /clipsheet ~/Desktop/animation-bug.mov\n\u003e the CSS transition is janky between 0:03 and 0:05 — what's the state at each frame?\n```\n\n**Compare working vs broken flows:**\n```\n\u003e /clipsheet ~/Desktop/checkout-working.mov ~/Desktop/checkout-broken.mov\n\u003e what's different between these two?\n```\n\n**Batch-review multiple recordings:**\n```\n\u003e /clipsheet bug1.mp4 bug2.mp4 bug3.mp4\n\u003e list every issue you see across all three\n```\n\n---\n\n## CLI reference\n\n### Process videos\n\n```bash\nclipsheet \u003cvideo\u003e [video2 ...] [options]\n```\n\n| Option                  | Default | Description                                                |\n|-------------------------|---------|------------------------------------------------------------|\n| `-o, --output \u003cdir\u003e`    | `\u003cvideo\u003e_clips/` | Output directory. Auto-created next to each input. Override with `-o` or `CLIPSHEET_OUTPUT_DIR` env var. |\n| `--grid \u003cRxC\u003e`          | `3x3`   | Cell layout. `2x3` for larger/more readable cells, `4x4` for dense recordings. |\n| `--max-grids \u003cn\u003e`       | `4`     | Cap on grid images. Bump for videos \u003e 8 minutes.           |\n| `--fps \u003cn\u003e`             | `4`     | Sample rate in fps. Higher values catch more transitions but take longer. |\n| `--keep-intermediate`   | `false` | Keep `_raw/` and `_cells/` for debugging.                  |\n| `--json`                | `false` | Emit a JSON summary on stdout (for piping to `jq`).        |\n| `--pretty`              | `false` | Pretty-print JSON (only with `--json`).                    |\n| `-v, --verbose`         | `false` | Show sampling details and frame counts.                    |\n\nExamples:\n\n```bash\nclipsheet recording.mp4                          # output → recording_clips/\nclipsheet bug1.mp4 bug2.mp4 bug3.mp4             # process multiple videos\nclipsheet bug1.mp4 bug2.mp4 -o ./all-bugs        # all outputs into one directory\nclipsheet recording.mp4 --grid 2x3               # larger cells for readable text\nclipsheet animation-bug.mp4 --fps 8              # catch fast UI transitions\n```\n\n### Other commands\n\n```bash\nclipsheet init                    # install skill into detected agents\nclipsheet init --agent \u003cname\u003e     # scope to specific agents (repeatable)\nclipsheet init --force            # overwrite existing skill installs\n\nclipsheet --status                # version, ffmpeg, agents, recent runs\nclipsheet --version               # short version string\nclipsheet --help                  # full help\n```\n\n---\n\n## Install\n\n```bash\npip install clipsheet\n# or: uv tool install clipsheet\n# or: pipx install clipsheet\n```\n\nffmpeg is bundled. No separate install needed.\n\n---\n\n## What it does NOT do\n\n- **No audio transcription.** Use Whisper if you need the soundtrack.\n- **No video editing, trimming, or transcoding.** Different tool category.\n- **No GPU.** CPU-only by design, for portability.\n\nWorks on any video format ffmpeg can read — MP4, MOV, HEVC, WebM, MKV, AVI, and more. When *not* to use clipsheet: if you need frame-by-frame motion analysis, audio understanding, or real-time video streaming, use Gemini 2.5 native video or Twelve Labs.\n\n## Performance\n\nclipsheet processing times on a 2024 M-series Mac (CPU only):\n\n| Video                    | Duration | Grids | clipsheet |\n|--------------------------|----------|-------|-----------|\n| Agent UI screen recording | 21s      | 2     | \u003c1s       |\n| Product demo              | 41s      | 4     | ~2s       |\n| Product demo              | 58s      | 4     | ~1s       |\n| YouTube video (1080p)     | 69s      | 4     | ~1s       |\n| Presentation              | 2 min    | 2     | ~2s       |\n| Presentation              | 3.3 min  | 4     | ~11s      |\n| Screen recording (HEVC)   | 4.9 min  | 4     | ~14s      |\n\n**Where does the time go?** clipsheet itself is fast — under 2 seconds for most videos under 2 minutes. When using it through an agent (Claude, Gemini, etc.), most of the wait is the model reading the grid images and generating a response, not clipsheet processing. A typical loop: ~1s clipsheet + ~5s image reading + ~10s response = ~15-20s total.\n\nRequires macOS 10.15+, Linux, or Windows 10+. Python 3.10+.\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoonamsnair%2Fclipsheet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpoonamsnair%2Fclipsheet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoonamsnair%2Fclipsheet/lists"}