{"id":50454126,"url":"https://github.com/mizcausevic-dev/aeo-crawler","last_synced_at":"2026-06-01T01:05:40.337Z","repository":{"id":357724520,"uuid":"1236274043","full_name":"mizcausevic-dev/aeo-crawler","owner":"mizcausevic-dev","description":"BFS crawler for AEO Protocol v0.1 declaration graphs. Seed an origin, follow primary_source URIs, emit JSON Lines records of every fetch. Built on aeo-sdk-go. Concurrent, depth-limited, budget-capped, stdlib-only HTTP.","archived":false,"fork":false,"pushed_at":"2026-05-14T01:31:50.000Z","size":32,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-14T03:29:07.763Z","etag":null,"topics":["aeo","aeo-protocol","ai-governance","answer-engine-optimization","crawler","entity-graph","go-cli","golang","kinetic-gain-protocol-suite","protocol-implementation","well-known"],"latest_commit_sha":null,"homepage":"https://github.com/mizcausevic-dev/aeo-crawler","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mizcausevic-dev.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-12T05:23:22.000Z","updated_at":"2026-05-14T01:31:54.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mizcausevic-dev/aeo-crawler","commit_stats":null,"previous_names":["mizcausevic-dev/aeo-crawler"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/mizc
ausevic-dev/aeo-crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Faeo-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Faeo-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Faeo-crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Faeo-crawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mizcausevic-dev","download_url":"https://codeload.github.com/mizcausevic-dev/aeo-crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Faeo-crawler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33755379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aeo","aeo-protocol","ai-governance","answer-engine-optimization","crawler","entity-graph","go-cli","golang","kinetic-gain-protocol-suite","protocol-implementation","well-known"],"created_at":"2026-06-01T01:05:40.278Z","updated_at":"2026-06-01T01:05:40.332Z","avatar_url":"https://github.com/mizcausevic-dev.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"read
me":"# aeo-crawler\n\nA breadth-first crawler for the [AEO Protocol v0.1](https://github.com/mizcausevic-dev/aeo-protocol-spec).\n\nGive it one seed origin. It fetches that origin's `/.well-known/aeo.json`, then follows every `authority.primary_sources` URI as a candidate origin to fetch next — up to a configurable depth and total fetch budget. Output is one JSON Lines record per origin attempted, suitable for piping into `jq`, a graph database, or any analytics pipeline.\n\nBuilt on top of [aeo-sdk-go](https://github.com/mizcausevic-dev/aeo-sdk-go).\n\n## Install\n\n```bash\ngo install github.com/mizcausevic-dev/aeo-crawler/cmd/aeo-crawler@latest\n```\n\n## Usage\n\n```bash\naeo-crawler --seed https://mizcausevic-dev.github.io\n```\n\nOutput (one JSON object per line):\n\n```json\n{\"origin\":\"https://mizcausevic-dev.github.io\",\"depth\":0,\"success\":true,\"entity_name\":\"Miz Causevic\",\"entity_type\":\"Person\",\"claims_count\":6,\"audit_mode\":\"none\",\"fetched_at\":\"2026-05-12T04:00:00Z\"}\n{\"origin\":\"https://github.com\",\"depth\":1,\"success\":false,\"error\":\"HTTP 404\",\"fetched_at\":\"2026-05-12T04:00:01Z\"}\n{\"origin\":\"https://www.linkedin.com\",\"depth\":1,\"success\":false,\"error\":\"HTTP 404\",\"fetched_at\":\"2026-05-12T04:00:01Z\"}\n{\"origin\":\"https://mizcausevic.com\",\"depth\":1,\"success\":false,\"error\":\"HTTP 404\",\"fetched_at\":\"2026-05-12T04:00:01Z\"}\n```\n\n## Flags\n\n| Flag | Default | Description |\n|---|---|---|\n| `--seed` | required | Seed origin URL. |\n| `--depth` | `2` | Maximum graph distance from the seed. `0` = only fetch the seed. |\n| `--max-fetches` | `100` | Global cap on total fetches. |\n| `--concurrency` | `4` | Maximum in-flight HTTP requests. |\n| `--timeout` | `10` | Per-request timeout in seconds. 
|\n\n## Useful pipelines\n\n**Count successful AEO declarations:**\n```bash\naeo-crawler --seed https://mizcausevic-dev.github.io | jq -c 'select(.success==true)' | wc -l\n```\n\n**List unique entity names:**\n```bash\naeo-crawler --seed https://mizcausevic-dev.github.io | jq -r 'select(.success==true) | .entity_name' | sort -u\n```\n\n**Find origins that declare an `audit_mode` of `signature`:**\n```bash\naeo-crawler --seed https://example.com --depth 3 | jq -c 'select(.audit_mode==\"signature\")'\n```\n\n## How discovery works\n\nFor each fetched declaration, `authority.primary_sources` is treated as the source of next-hop candidate origins. Each URI is normalized to its scheme + host (path stripped). Already-visited origins are not re-fetched. The crawler does not currently chase `citation_preferences.canonical_links` or `claims[].evidence`; both are on the roadmap for v0.2.\n\n## Conformance\n\nOperates against AEO Protocol v0.1 declarations at **conformance Level 1 (Declare)**. Signature verification (L2) and audit-report submission (L3) are not invoked; signed documents are recorded as `audit_mode: \"signature\"` but not verified.\n\n## Dependencies\n\n- [github.com/mizcausevic-dev/aeo-sdk-go](https://github.com/mizcausevic-dev/aeo-sdk-go) — Go SDK for parsing and fetching AEO declarations\n- Go standard library (`net/http`, `encoding/json`, `context`, `sync`)\n\n## Development\n\n```bash\ngo vet ./...\ngo test -v ./...\ngo build ./cmd/aeo-crawler\n```\n\nTests use `httptest` to serve fixture AEO documents — no network is required.\n\n## Specification\n\nFull spec at [github.com/mizcausevic-dev/aeo-protocol-spec](https://github.com/mizcausevic-dev/aeo-protocol-spec).\n\n## License\n\nAGPL-3.0.\n\n## Kinetic Gain Protocol Suite\n\n| Spec | Implementation |\n|---|---|\n| [AEO Protocol](https://github.com/mizcausevic-dev/aeo-protocol-spec) | [aeo-sdk-python](https://github.com/mizcausevic-dev/aeo-sdk-python) · 
[aeo-sdk-typescript](https://github.com/mizcausevic-dev/aeo-sdk-typescript) · [aeo-sdk-rust](https://github.com/mizcausevic-dev/aeo-sdk-rust) · [aeo-sdk-go](https://github.com/mizcausevic-dev/aeo-sdk-go) · [aeo-cli](https://github.com/mizcausevic-dev/aeo-cli) · **aeo-crawler** (this) |\n| [Prompt Provenance](https://github.com/mizcausevic-dev/prompt-provenance-spec) | — |\n| [Agent Cards](https://github.com/mizcausevic-dev/agent-cards-spec) | — |\n| [AI Evidence Format](https://github.com/mizcausevic-dev/ai-evidence-format-spec) | — |\n| [MCP Tool Cards](https://github.com/mizcausevic-dev/mcp-tool-card-spec) | — |\n\n---\n\n**Connect:** [LinkedIn](https://www.linkedin.com/in/mirzacausevic/) · [Kinetic Gain](https://kineticgain.com) · [Medium](https://medium.com/@mizcausevic/) · [Skills](https://mizcausevic.com/skills/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Faeo-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmizcausevic-dev%2Faeo-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Faeo-crawler/lists"}