{"id":47711260,"url":"https://github.com/kryptobaseddev/openclawdocs","last_synced_at":"2026-04-02T18:32:39.138Z","repository":{"id":347725669,"uuid":"1195069514","full_name":"kryptobaseddev/openclawdocs","owner":"kryptobaseddev","description":"LLM-agent-optimized CLI for syncing, searching, and browsing OpenClaw documentation locally","archived":false,"fork":false,"pushed_at":"2026-03-29T08:29:46.000Z","size":148,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-29T10:11:14.609Z","etag":null,"topics":["ai-agent","cli","documentation","fts5","llm","openclaw","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kryptobaseddev.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"ko_fi":"H2H815OTBU"}},"created_at":"2026-03-29T07:08:39.000Z","updated_at":"2026-03-29T08:29:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/kryptobaseddev/openclawdocs","commit_stats":null,"previous_names":["kryptobaseddev/openclawdocs"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/kryptobaseddev/openclawdocs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kryptobaseddev%2Fopenclawdocs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kryptobaseddev%2Fopenclawdocs/tags
","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kryptobaseddev%2Fopenclawdocs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kryptobaseddev%2Fopenclawdocs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kryptobaseddev","download_url":"https://codeload.github.com/kryptobaseddev/openclawdocs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kryptobaseddev%2Fopenclawdocs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31312917,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","cli","documentation","fts5","llm","openclaw","python"],"created_at":"2026-04-02T18:32:36.717Z","updated_at":"2026-04-02T18:32:39.125Z","avatar_url":"https://github.com/kryptobaseddev.png","language":"Python","funding_links":["https://ko-fi.com/H2H815OTBU"],"categories":[],"sub_categories":[],"readme":"# ocdocs (openclawdocs)\n\n**A CLI built for LLM agents, not humans.** Syncs, indexes, and searches the entire [OpenClaw](https://openclaw.ai) documentation locally so AI agents can look things up instead of 
hallucinating them.\n\n[![npm](https://img.shields.io/npm/v/openclawdocs)](https://www.npmjs.com/package/openclawdocs)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Ko-fi](https://img.shields.io/badge/Ko--fi-Support-ff5e5b?logo=ko-fi)](https://ko-fi.com/H2H815OTBU)\n\n---\n\n## Why I Built This\n\nI'm building [CleoClaw](https://github.com/kryptobaseddev/cleoclaw) — a Mission Control frontend for OpenClaw instances. Every time I ask an AI agent to implement something against the OpenClaw API, it confidently makes up endpoint behavior, invents config keys that don't exist, and hallucinates RPC methods. Context7, training data, web search — none of them reliably reflect what's *actually* in the [OpenClaw docs](https://docs.openclaw.ai) right now.\n\nSo I built a tool that downloads the entire doc site, indexes it locally with SQLite FTS5, and lets agents query it with minimal token cost. Now when my agent needs to know how Discord integration works, it spends 150 tokens reading the summary instead of 10,000 tokens guessing wrong.\n\n**This is purpose-built for OpenClaw documentation.** It's not a generic doc scraper. 
It knows about OpenClaw's `llms-full.txt` dump, Mintlify page structure, and the specific categories that matter when you're building on top of OpenClaw.\n\n**Humans can use it too** (the output is perfectly readable, as you'll see below), but every design decision optimizes for agent consumption: structured output, minimal tokens, progressive disclosure.\n\n## What It Looks Like\n\n### Browse all 370 topics across 19 categories\n\n```bash\nocdocs list\n```\n\n![ocdocs list](docs/screenshots/list.png)\n\n### Search with FTS5 BM25 ranking + fuzzy matching\n\n```bash\nocdocs search discord\n```\n\n![ocdocs search](docs/screenshots/search.png)\n\n### Drill into any topic (summary first, full content on demand)\n\n```bash\nocdocs show channels/discord\n```\n\n```\n# Discord\nCategory: channels | 4,797 words | https://docs.openclaw.ai/channels/discord\n\nStatus: ready for DMs and guild channels via the official Discord gateway.\n\n## Sections (17)\n- Quick setup\n- Recommended: Set up a guild workspace\n- Runtime model\n- Forum channels\n- Interactive components\n- Access control and routing\n- Developer Portal setup\n- Native commands and command auth\n- Feature details\n- Tools and action gates\n- Components v2 UI\n- Voice channels\n- Voice messages\n- Troubleshooting\n- Configuration reference pointers\n- Safety and operations\n- Related\n\n\u003e openclaw-docs show channels/discord --full\n```\n\n### Pull just the section you need\n\n```bash\nocdocs show channels/discord -s \"quick setup\"\n```\n\n```\n# Discord \u003e Quick setup\nCategory: channels | Section of 4,797 word topic\n\nYou will need to create a new application with a bot, add the bot to your\nserver, and pair it to OpenClaw. We recommend adding your bot to your own\nprivate server...\n```\n\n### Diff local docs against the live site\n\n```bash\nocdocs diff\n```\n\n```\nChecking remote...\nNo changes detected. 
Local docs are up to date.\n```\n\n### Sync pulls everything down (idempotent)\n\n```bash\nocdocs sync\n```\n\n```\nSyncing documentation...\nSync complete.\n  Added:     0\n  Updated:   0\n  Removed:   0\n  Unchanged: 370\n  Total:     370\n```\n\nThe first run downloads all 370 topics. Subsequent runs use ETag caching and content hashing, so they fetch only what changed.\n\n---\n\n## Install\n\n### npm (recommended)\n\n```bash\nnpm install -g openclawdocs\nocdocs sync\n```\n\nRequires Python 3.12+. The postinstall script creates a Python venv and installs everything automatically.\n\nThree aliases work interchangeably: `ocdocs`, `openclawdocs`, `openclaw-docs`\n\n### pip (manual)\n\n```bash\ngit clone https://github.com/kryptobaseddev/openclawdocs.git\ncd openclawdocs\npython3 -m venv .venv\nsource .venv/bin/activate\npip install -e .\nocdocs sync\n```\n\n---\n\n## The Token Economy\n\nEvery command is designed around **Minimum Viable Information (MVI)**: agents get exactly what they need, nothing more. Each level costs progressively more tokens:\n\n```\nLevel 0: ocdocs search \"discord\"                        →   120 tokens  (find the page)\nLevel 1: ocdocs show channels/discord                   →   150 tokens  (summary + sections)\nLevel 2: ocdocs show channels/discord -s \"quick setup\"  → 1,000 tokens  (just that section)\nLevel 3: ocdocs show channels/discord --full            → 10,800 tokens (everything)\n```\n\nMost agent tasks finish at Level 1 or 2. The `--full` dump is a last resort. 
That's a **70-90x token reduction** over blindly dumping entire doc pages into context.\n\n| Command | Purpose | Token Cost |\n|---------|---------|------------|\n| `search \u003cquery\u003e` | FTS5 BM25 + fuzzy search | ~120 tokens |\n| `search \u003cquery\u003e -v` | Search with content snippets | ~300 tokens |\n| `show \u003cpath\u003e` | Topic summary + section list | ~150 tokens |\n| `show \u003cpath\u003e -s \u003cname\u003e` | Single section content | ~500-1500 tokens |\n| `show \u003cpath\u003e --full` | Complete topic content | ~2K-15K tokens |\n| `list` | All categories with counts | ~100 tokens |\n| `list -c \u003ccategory\u003e` | Topics within a category | ~200 tokens |\n| `sync` | Download/update docs (idempotent) | N/A |\n| `sync --force` | Force re-download, bypass cache | N/A |\n| `status` | Sync health check | ~50 tokens |\n| `diff` | Compare local vs remote changes | ~100 tokens |\n\n---\n\n## Agent Integration\n\n### Claude Code / CLAUDE.md\n\nDrop this in your `CLAUDE.md` and your agent stops guessing:\n\n```markdown\nWhen working with OpenClaw features, ALWAYS verify behavior against local docs:\n  ocdocs search \"\u003ctopic\u003e\"\n  ocdocs show \u003cpath\u003e\n  ocdocs show \u003cpath\u003e --section \"\u003csection name\u003e\"\nNever assume OpenClaw API behavior from training data.\n```\n\n### Agent Workflow Pattern\n\n```\n1. ocdocs status          — synced? (run ocdocs sync if not)\n2. ocdocs search \"TOPIC\"  — find the right page\n3. ocdocs show PATH       — read summary + section list\n4. ocdocs show PATH -s X  — read the specific section needed\n5. 
Only use --full as last resort\n```\n\n### Programmatic (Python)\n\n```python\nfrom openclaw_docs.storage import DocsStorage\nfrom openclaw_docs.search import SearchEngine\nfrom openclaw_docs.config import get_db_path, get_topics_dir\n\nstorage = DocsStorage(get_db_path())\nengine = SearchEngine(storage, get_topics_dir())\nresults = engine.search(\"gateway authentication\", limit=5)\n```\n\n---\n\n## Architecture\n\n```\ndocs.openclaw.ai\n       |\n       +-- llms-full.txt ----\u003e parser.py (section split) ----\u003e 362 topics\n       |                                                            |\n       +-- llms.txt ----------\u003e parser.py (index) ----------\u003e index entries\n       |                                                            |\n       +-- /page HTML --------\u003e trafilatura.extract() --------\u003e  8 topics\n                                                                    |\n                                                        +-----------+\n                                                        v\n                                                 SQLite + FTS5\n                                          (content-sync + triggers)\n                                                        |\n                                             +----------+----------+\n                                             v          v          v\n                                          search      show       list\n                                         (BM25 +    (PD L1-3)  (categories)\n                                          fuzzy)\n```\n\n- **Primary source**: `llms-full.txt` — pre-formatted markdown dump from OpenClaw (362 pages)\n- **Gap filler**: trafilatura extracts ~8 pages missing from the dump (direct HTML scrape)\n- **Search**: SQLite FTS5 with BM25 ranking + rapidfuzz fuzzy title matching\n- **Parsing**: markdown-it-py AST for section extraction (no fragile regex)\n- **Storage**: SQLite with FTS5 external content table, auto-synced 
via triggers\n- **Data location**: OS-standard paths via platformdirs (`~/.local/share/openclawdocs` on Linux)\n\n### Coverage\n\n**370 topics** across **19 categories**. 100% of published English content on docs.openclaw.ai. 13 pages listed in the site navigation return HTTP 404 (unpublished experiments/plans) and are automatically skipped.\n\n---\n\n## Built For CleoClaw\n\nI'm actively using `ocdocs` while building [CleoClaw](https://github.com/kryptobaseddev/cleoclaw) — a Mission Control single-pane-of-glass frontend for managing OpenClaw instances. CleoClaw talks to OpenClaw's gateway API, manages agents, handles board chat, and needs to get the API behavior right. `ocdocs` is how my AI agents (and I) verify what the docs actually say before writing integration code.\n\nIf you're building anything on top of OpenClaw, this tool exists so your agents stop making things up.\n\n---\n\n## Configuration\n\n| Env Variable | Purpose |\n|---|---|\n| `OPENCLAW_DOCS_DATA_DIR` | Override data storage directory |\n\nDefault data locations (via [platformdirs](https://github.com/platformdirs/platformdirs)):\n- **Linux**: `~/.local/share/openclawdocs`\n- **macOS**: `~/Library/Application Support/openclawdocs`\n- **Windows**: `C:\\Users\\\u003cuser\u003e\\AppData\\Local\\openclawdocs`\n\n---\n\n## Contributing\n\nI'm a solo dev on this (well, me and my AI sidekick who definitely wrote more of this codebase than I did). If you have ideas for improvements, find bugs, or want to extend the tool — PRs and issues are welcome.\n\n**Things I'm actively looking for:**\n- Better search ranking strategies for doc-specific queries\n- Smarter section extraction for Mintlify's custom MDX components\n- MCP server integration so agents can query docs without shelling out\n- Ideas for reducing first-sync time\n- Anything that makes this more useful for agents building on OpenClaw\n\n### Setup\n\n1. Fork and clone\n2. 
`python3 -m venv .venv \u0026\u0026 source .venv/bin/activate \u0026\u0026 pip install -e .`\n3. Make changes\n4. `ocdocs sync \u0026\u0026 ocdocs search \"test query\"` to verify\n5. Create a changeset: `npm run changeset`\n6. Submit a PR\n\n### Versioning\n\n[CalVer](https://calver.org/) (`YYYY.M.MICRO`) enforced by [VersionGuard](https://github.com/kryptobaseddev/versionguard). Change management via [Changesets](https://github.com/changesets/changesets).\n\n### Code Style\n\n- Python: ruff (`ruff check src/`)\n- All CLI output must be machine-parseable, not prose\n- If a human can't read it, that's fine. If an agent can't parse it, that's a bug.\n\n---\n\n## Support\n\nIf this saves your agents from hallucinating OpenClaw APIs, consider buying me a coffee.\n\n[![Ko-fi](https://ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/H2H815OTBU)\n\n## License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkryptobaseddev%2Fopenclawdocs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkryptobaseddev%2Fopenclawdocs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkryptobaseddev%2Fopenclawdocs/lists"}