{"id":50216335,"url":"https://github.com/rogerchappel/crawldeck","last_synced_at":"2026-05-26T09:04:36.868Z","repository":{"id":358361569,"uuid":"1231743133","full_name":"rogerchappel/crawldeck","owner":"rogerchappel","description":"Local-first crawl job deck for fixture-backed queues, health, and crawler adapter seams.","archived":false,"fork":false,"pushed_at":"2026-05-17T01:16:57.000Z","size":44,"stargazers_count":1,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-17T02:47:29.582Z","etag":null,"topics":["agent-tools","cli","crawler","local-first","queue","typescript"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rogerchappel.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-07T08:41:53.000Z","updated_at":"2026-05-17T01:17:01.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/rogerchappel/crawldeck","commit_stats":null,"previous_names":["rogerchappel/crawldeck"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/rogerchappel/crawldeck","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogerchappel%2Fcrawldeck","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogerchappel%2Fcrawldeck/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogerchappel%2Fcrawldeck/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogerchappel%2Fcrawldeck/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rogerchappel","download_url":"https://codeload.github.com/rogerchappel/crawldeck/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rogerchappel%2Fcrawldeck/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33512343,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T03:12:49.672Z","status":"ssl_error","status_checked_at":"2026-05-26T03:12:47.976Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-tools","cli","crawler","local-first","queue","typescript"],"created_at":"2026-05-26T09:04:15.331Z","updated_at":"2026-05-26T09:04:36.858Z","avatar_url":"https://github.com/rogerchappel.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# crawldeck\n\nA local-first crawl job deck for agents and developers who want to queue, pause, inspect, and report on crawl work without surprise network calls.\n\ncrawldeck is inspired by the useful control-plane shape of [CrawlBar](https://github.com/vincentkoc/CrawlBar), but it is a fresh TypeScript CLI implementation with different branding, scope, and code. V1 deliberately starts as a fixture-backed CLI rather than a macOS menu bar app, so it is testable, deterministic, and safe for agent workflows.\n\n## Why this exists\n\nCrawlers tend to become invisible background magic. crawldeck makes the boring control surface explicit:\n\n- profiles describe what can be crawled\n- jobs live in local JSON queue files\n- health shows queue depth and failures\n- reports summarize what happened\n- adapter seams leave room for real crawlers later\n\nNo telemetry. No credentials. No external crawl network calls by default.\n\n## Install\n\n```bash\nnpm install\nnpm run build\nnpm link\n```\n\nOr run directly from a checkout:\n\n```bash\nnode dist/cli.js --help\n```\n\n## Quickstart\n\n```bash\nnpm install\nnpm run build\ncrawldeck init\ncrawldeck profile add sample --fixture ./fixtures/sample-site\ncrawldeck job enqueue sample\ncrawldeck job list\ncrawldeck inspect sample\ncrawldeck job pause \u003cjob-id\u003e\ncrawldeck job resume \u003cjob-id\u003e\ncrawldeck job start \u003cjob-id\u003e\ncrawldeck health\ncrawldeck report\n```\n\nThe sample fixture includes a 404 on purpose, so the started job demonstrates failure reporting.\n\n## Commands\n\n```text\ncrawldeck init\ncrawldeck adapters\ncrawldeck profile add \u003cname\u003e --fixture \u003cpath\u003e [--out \u003cdir\u003e]\ncrawldeck profile list\ncrawldeck inspect \u003cprofile\u003e\ncrawldeck job enqueue \u003cprofile\u003e\ncrawldeck job list\ncrawldeck job next\ncrawldeck job status \u003cjob-id\u003e\ncrawldeck job start \u003cjob-id\u003e\ncrawldeck job pause \u003cjob-id\u003e\ncrawldeck job resume \u003cjob-id\u003e\ncrawldeck job complete \u003cjob-id\u003e\ncrawldeck health\ncrawldeck report [--json]\n```\n\n## Local state\n\nBy default crawldeck writes only under:\n\n- `.crawldeck/queue.json`\n- `.crawldeck/out/\u003cjob-id\u003e/...`\n\nUse `--deck-dir \u003cdir\u003e` to put the queue somewhere else.\n\n## Adapter seam\n\nThe built-in adapter is `fixture`. Future real crawler adapters can register through the library seam:\n\n```js\nimport { adapterSeam } from 'crawldeck';\n\nadapterSeam('my-crawler', () =\u003e ({\n  name: 'my-crawler',\n  async inspect(profile) { return []; },\n  async run(profile, job) { return { totalItems: 0, processedItems: 0, errors: [], reportPath: '' }; }\n}));\n```\n\nReal adapters should be explicit about network access, robots.txt behavior, rate limits, and credential use.\n\n## Verification\n\n```bash\nnpm test\nnpm run check\nnpm run build\nnpm run smoke\nbash scripts/validate.sh\n```\n\n## Safety and privacy\n\n- Local-first queue and reports.\n- Fixture-backed by default.\n- No hidden telemetry or analytics.\n- No credential scraping or secret storage.\n- No publishing or external crawling unless a future adapter explicitly implements it.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frogerchappel%2Fcrawldeck","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frogerchappel%2Fcrawldeck","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frogerchappel%2Fcrawldeck/lists"}