{"id":50337842,"url":"https://github.com/openclaw/photoscrawl","last_synced_at":"2026-05-29T15:00:33.574Z","repository":{"id":361023725,"uuid":"1252720351","full_name":"openclaw/photoscrawl","owner":"openclaw","description":"it crawls your apple photos library and feeds it to your agent","archived":false,"fork":false,"pushed_at":"2026-05-28T22:45:47.000Z","size":84,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-28T23:13:59.055Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openclaw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"github":["moltbot"]}},"created_at":"2026-05-28T19:57:57.000Z","updated_at":"2026-05-28T22:45:50.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/openclaw/photoscrawl","commit_stats":null,"previous_names":["openclaw/photoscrawl"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/openclaw/photoscrawl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openclaw%2Fphotoscrawl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openclaw%2Fphotoscrawl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openclaw%2Fphotoscrawl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openclaw%2Fphotoscrawl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openclaw","download_url":"https://codeload.github.com/openclaw/photoscrawl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openclaw%2Fphotoscrawl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33657690,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-29T15:00:20.871Z","updated_at":"2026-05-29T15:00:33.561Z","avatar_url":"https://github.com/openclaw.png","language":"Go","funding_links":["https://github.com/sponsors/moltbot"],"categories":[],"sub_categories":[],"readme":"# photoscrawl\n\nLocal-first Apple Photos crawler for the OpenClaw crawl-family ecosystem.\n\n`photoscrawl` builds a `photos.sqlite` archive from a user's Photos library. The\ngoal is not photo backup. The goal is to help users understand their own library:\nwhere photos were taken, when they were taken, what is visible, which\ndocuments/screenshots/receipts exist, which assets belong together, and what\nevidence supports each result.\n\n## Principles\n\n- Go product code only.\n- Use `github.com/openclaw/crawlkit` for shared crawler mechanics.\n- Local-first by default; no cloud model calls unless the user explicitly selects\n  assets or derivatives to send.\n- Read-only Photos access. Never write back to Photos.\n- Snapshot before crawling live library state.\n- Metadata for all assets, local classification for high-signal coverage.\n- Store observations and evidence, not final people/trip/place truth.\n\n## First Commands\n\n```sh\ngo run ./cmd/photoscrawl init --json\ngo run ./cmd/photoscrawl status --json\ngo run ./cmd/photoscrawl crawl --library \"$HOME/Pictures/Photos Library.photoslibrary\" --json\ngo run ./cmd/photoscrawl classify --limit 100 --json\ngo run ./cmd/photoscrawl classify --local-model gemma4:e4b --limit 20 --json\ngo run ./cmd/photoscrawl search --query \"drone beach portugal\" --json\ngo run ./cmd/photoscrawl open --id asset:\u003cid\u003e --json\ngo run ./cmd/photoscrawl neighbors --id asset:\u003cid\u003e --json\ngo run ./cmd/photoscrawl evidence --row-id asset:\u003cid\u003e --json\n```\n\nPlanned crawl-family commands:\n\n```sh\nphotoscrawl export --format lifecrawler --json\n```\n\n`crawl` tries PhotoKit first for metadata. PhotoKit enumerates the active system\nPhotos library; the `--library` path is validated and recorded as the requested\nsource. If PhotoKit is unavailable or denied, the POC falls back to a read-only\n`database/Photos.sqlite` transaction and labels that evidence as\n`photos_sqlite_snapshot`.\n\n`crawl` does not export originals or force iCloud downloads. It records already\nlocal package media paths for derivatives/renders/originals when they exist, so\ncontent classification can use local files without changing Photos or iCloud\nstate. Every imported asset is queued for `classify`.\n\n`classify` drains that queue into evidence-backed local metadata observations.\nWith `--local-model \u003collama-model\u003e`, it also sends already-local image bytes to a\nlocal Ollama vision model and stores typed candidate observations:\nscene summaries, visible-text summaries, place-type/name/venue candidates,\nobjects/foods, anonymous people presence, privacy hints, cluster terms, and\nuncertainties. These are evidence-backed model observations, not durable\npeople/place/trip truth.\n\n`neighbors` returns source-level adjacent assets only. It does not create trips,\npeople, places, or clusters. Current reasons are deterministic archive facts:\nsame burst id, same album id, same resource hash, nearby creation time, nearby\nraw GPS, and shared local observation labels.\n\n## Current Useful Output\n\nToday the POC sees useful source facts and optional local multimodal observations:\n\n- asset timing, media type, dimensions, favorite/hidden state, timezone, and\n  burst metadata;\n- resource type, UTI, filename, local/remote availability, iCloud download need,\n  and resource hash when already local;\n- album membership and raw GPS observations with evidence refs;\n- metadata-only observations for media type, local content availability,\n  geometry, burst membership, resource UTI/type, and weak\n  screenshot/document/receipt candidates from filenames, albums, and metadata;\n- optional local model observations from already-local image derivatives or\n  originals, plus normalized terms for search and later clustering;\n- quality observations for model failures such as prompt leakage;\n- status coverage counts for GPS, observations, local resources, remote\n  resources, classification queue state, and observation types;\n- search/open/evidence/neighbors JSON that points every claim back to source\n  rows or evidence ids.\n\nIt does not create durable identities, trips, places, relationships, embeddings,\nor global clusters yet.\n\n## Why This Shape\n\nThis is a local-first personal media index:\n\n- typed local objects;\n- provenance on every derived claim;\n- entity and link resolution as explainable pipelines;\n- graph traversal and timelines as first-class query shapes;\n- clusters and trips as later hypotheses, not v1 truth;\n- user-owned local archive with no sharing or hidden scoring by default.\n\nPhotos are useful because a saved image usually records something the user cared\nabout: a place, person, document, trip, purchase, home, event, hobby, meal,\nscreenshot, or drone flight. The crawler's job is to preserve that context\nwithout pretending GPS, face labels, or classifier labels are perfect facts.\n\n## v1 Scope\n\nBuild `photos.sqlite` with:\n\n- assets and resource metadata from Apple Photos;\n- local original-download queue with bounded cache/ringbuffer;\n- GPS observations as raw coordinates only;\n- album membership;\n- file/resource hashes when originals are available;\n- Vision/Core ML observations: labels, OCR, faces, barcodes, screenshot/document\n  markers, quality/similarity signals where useful;\n- evidence refs for every observation;\n- JSON status/search/open/neighbors/evidence commands.\n\nOut of scope for v1:\n\n- durable person identity;\n- durable trip/place/event truth;\n- relationship inference;\n- global photo clustering;\n- cloud classification by default;\n- Photos writeback.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenclaw%2Fphotoscrawl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenclaw%2Fphotoscrawl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenclaw%2Fphotoscrawl/lists"}