{"id":31534028,"url":"https://github.com/tazztone/subject-frame-extractor","last_synced_at":"2026-05-16T11:32:49.006Z","repository":{"id":315490489,"uuid":"1059705014","full_name":"tazztone/subject-frame-extractor","owner":"tazztone","description":" extracting, analyzing, and filtering frames from video files or YouTube links","archived":false,"fork":false,"pushed_at":"2026-05-08T22:35:36.000Z","size":15510,"stargazers_count":2,"open_issues_count":12,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-08T23:38:08.556Z","etag":null,"topics":["ffmpeg","opencv","pyiqa","sam3","ytdlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tazztone.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-09-18T20:24:57.000Z","updated_at":"2026-05-08T21:58:21.000Z","dependencies_parsed_at":null,"dependency_job_id":"bb0fe39b-2e41-4df6-a8f5-82a400fda88c","html_url":"https://github.com/tazztone/subject-frame-extractor","commit_stats":null,"previous_names":["tazztone/subject-frame-extractor"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tazztone/subject-frame-extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tazztone%2Fsubject-frame-extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tazztone%2Fsubject-frame-extractor/tags","releases_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tazztone%2Fsubject-frame-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tazztone%2Fsubject-frame-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tazztone","download_url":"https://codeload.github.com/tazztone/subject-frame-extractor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tazztone%2Fsubject-frame-extractor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33100847,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-16T04:41:52.686Z","status":"ssl_error","status_checked_at":"2026-05-16T04:41:52.009Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ffmpeg","opencv","pyiqa","sam3","ytdlp"],"created_at":"2025-10-04T05:16:22.836Z","updated_at":"2026-05-16T11:32:49.000Z","avatar_url":"https://github.com/tazztone.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Subject Frame 
Extractor\n\n[![Python](https://img.shields.io/badge/python-3.10%2B-blue)](https://python.org)\n[![PyTorch](https://img.shields.io/badge/PyTorch-CUDA-ee4c2c)](https://pytorch.org/)\n[![Gradio](https://img.shields.io/badge/UI-Gradio%206.x-ff5000)](https://gradio.app/)\n[![License: MIT](https://img.shields.io/badge/license-MIT-yellow)](LICENSE)\n\nAn AI-powered tool for extracting, analyzing, and filtering high-quality frames from video footage. Designed for dataset builders (LoRA / Dreambooth training), content creators, and researchers who need curated image sets from raw video — not just raw frame dumps.\n\nAlso includes a **Photo Culling** mode for scoring and rating RAW photo libraries.\n\n## Tech Stack\n\n| Layer | Technology |\n|---|---|\n| Runtime | Python 3.10+ (3.12 recommended) |\n| UI | Gradio 6.x |\n| Segmentation | SAM 3 (Segment Anything Model 3, Facebook Research) |\n| Object detection | YOLO (80 COCO classes) |\n| Face analysis | InsightFace (similarity matching, blink detection, head pose) |\n| Quality scoring | NIQE (perceptual), Laplacian variance (sharpness), pHash + LPIPS (dedup) |\n| Video / media | FFmpeg, yt-dlp |\n| RAW processing | ExifTool (embedded preview extraction — no demosaicing) |\n| Data | PyTorch, NumPy, OpenCV, SQLite, Pydantic |\n| Dependency management | uv |\n\n## Architecture\n\n```\n/\n├── app.py                  # Gradio UI entry point\n├── cli.py                  # Headless CLI (extract / analyze / full / status / photo)\n├── core/\n│   ├── config.py           # Full configuration schema (Pydantic)\n│   ├── extractor.py        # Extraction strategies (keyframe, interval, scene, Nth)\n│   ├── analyzer.py         # AI analysis pipeline (SAM seeding, tracking, metrics)\n│   ├── tracker.py          # Subject tracking across scenes\n│   ├── face.py             # InsightFace integration\n│   ├── quality.py          # NIQE, sharpness, entropy, LPIPS scoring\n│   ├── dedup.py            # pHash + LPIPS deduplication\n│   ├── 
photo.py            # Photo culling: RAW ingest, scoring, XMP sidecar export\n│   └── database.py         # SQLite session metadata\n├── SAM3_repo/              # SAM 3 submodule\n└── scripts/\n    ├── linux_run_app.sh\n    └── setup scripts\n```\n\n**Pipeline:** Extract frames → scene segmentation → AI seeding (face ref / text / YOLO) → SAM 3 propagation → quality metrics → interactive filtering → AR-aware crop export.\n\n## Key Features\n\n- **Extraction strategies** — keyframes, fixed intervals, scene-based, every Nth frame; YouTube URL support\n- **Multi-class tracking** — find and track any of 80 COCO objects via YOLO + SAM 3; open-vocabulary text descriptions\n- **Face matching** — find every frame of a specific person using an InsightFace reference photo\n- **Quality filtering** — interactive sliders for sharpness, contrast, NIQE perceptual score\n- **Smart deduplication** — pHash + LPIPS removes near-identical frames per scene\n- **AR-aware export** — subject-centred crops in 1:1, 9:16, 16:9, or custom ratios\n- **Photo culling mode** — RAW preview extraction (CR2, NEF, ARW, DNG, ORF…), AI scoring, star-rating export to XMP sidecars for Lightroom/Capture One\n\n## Quick Start\n\n**Prerequisites:** Python 3.10+, FFmpeg in PATH, CUDA GPU recommended (~8 GB VRAM for SAM 3)\n\n```bash\ngit clone --recursive https://github.com/tazztone/subject-frame-extractor.git\ncd subject-frame-extractor\nuv sync\n\n# Launch Gradio UI\nuv run python app.py\n# → http://127.0.0.1:7860\n```\n\n## CLI Usage\n\n```bash\n# Extract frames\nuv run python cli.py extract --video video.mp4 --output ./results --nth-frame 10\n\n# Run AI analysis (with face reference)\nuv run python cli.py analyze --session ./results --video video.mp4 --face-ref person.png --resume\n\n# Full pipeline in one command\nuv run python cli.py full --video video.mp4 --output ./results --face-ref person.png\n\n# Photo culling workflow\nuv run python cli.py photo ingest --folder /path/to/raws --output ./photo_session\nuv 
run python cli.py photo score --session ./photo_session\nuv run python cli.py photo export --session ./photo_session   # → XMP sidecars\n```\n\n## Configuration\n\nSee `core/config.py` for the full Pydantic schema. Key settings:\n\n| Category | Key Fields | Default |\n|---|---|---|\n| Paths | `logs_dir`, `models_dir`, `downloads_dir` | `logs`, `models`, `downloads` |\n| Models | `face_model_name`, `tracker_model_name` | `buffalo_l`, `sam3` |\n| Performance | `analysis_default_workers`, `cache_size` | `4`, `200` |\n\nSee [AGENTS.md](AGENTS.md) for architecture details, critical rules, and development guidelines.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\n---\n\n## Technical Debt \u0026 Roadmap\n\nThis project uses a semi-automated TODO tracking system to prioritize refactors and features.\n\n- **Check Current Debt**: Run `uv run python scripts/generate_todo_report.py` to generate `TODO_REPORT.md`.\n- **Top 20 Summary**:\n    1.  [High] Refactor `core/pipelines.py` to use modular `core/managers`. (In Progress)\n    2.  [High] Implement thread-safe model access for InsightFace.\n    3.  [Medium] Add temporal consistency smoothing between frames in `MaskPropagator`.\n    4.  [Medium] Add adaptive quality thresholds based on propagation distance.\n    5.  [Low] Support demosaicing for RAW photo ingest (currently uses previews).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftazztone%2Fsubject-frame-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftazztone%2Fsubject-frame-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftazztone%2Fsubject-frame-extractor/lists"}