{"id":49003913,"url":"https://github.com/michellepace/docs-for-ai","last_synced_at":"2026-04-18T19:34:07.456Z","repository":{"id":318856246,"uuid":"1075307259","full_name":"michellepace/docs-for-ai","owner":"michellepace","description":"Curate and index clean docs for clean AI context to ask questions against docs.","archived":false,"fork":false,"pushed_at":"2026-02-04T23:54:53.000Z","size":3622,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-05T11:43:23.720Z","etag":null,"topics":["claude-code","context-management","curation","documentation","slash-commands"],"latest_commit_sha":null,"homepage":"","language":"MDX","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/michellepace.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-13T10:18:34.000Z","updated_at":"2026-02-04T23:54:57.000Z","dependencies_parsed_at":"2026-01-04T07:06:28.147Z","dependency_job_id":null,"html_url":"https://github.com/michellepace/docs-for-ai","commit_stats":null,"previous_names":["michellepace/docs-for-claude","michellepace/docs-for-ai"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/michellepace/docs-for-ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michellepace%2Fdocs-for-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michellepace%2Fdocs-for-ai/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/michellepace%2Fdocs-for-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michellepace%2Fdocs-for-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/michellepace","download_url":"https://codeload.github.com/michellepace/docs-for-ai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michellepace%2Fdocs-for-ai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31982743,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T17:30:12.329Z","status":"ssl_error","status_checked_at":"2026-04-18T17:29:59.069Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["claude-code","context-management","curation","documentation","slash-commands"],"created_at":"2026-04-18T19:34:06.966Z","updated_at":"2026-04-18T19:34:07.439Z","avatar_url":"https://github.com/michellepace.png","language":"MDX","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Curate Docs For AI (with Claude Code)\n\nCurate and index documentation from any website into collections like `tailwind/`, `horses/`, etc. Reference collection indexes in your AI chats (e.g. `@tailwind/INDEX.xml what's a utility?`) so that only relevant docs are analysed. Much cleaner than a web-fetch and more focussed than a web-search. 
Keep your AI context sharp.\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"x_docs/images/example_usage.jpg\" alt=\"Terminal showing three-step workflow: (1) Running /curate-doc biome command, (2) Curation success output showing scraped documentation and generated INDEX.xml entry, (3) Use /ask-docs to query docs. Handwritten annotations highlight each step.\" width=\"940\"\u003e\n  \u003cp\u003e\u003cem\u003eComplete workflow: curate → auto scrape → \"/ask-docs biome Validate my config file please\"\u003c/em\u003e\u003c/p\u003e\n\u003c/div\u003e\n\n## 📦 Repo Collections\n\nAvailable collections in this repo:\n\n| Collection | Collection Index | Description | Scraped | Source |\n|:-----------|:-----------------|:------------|:--------|:-------|\n| 📦 [`biome/`](biome/) | 📄 [`biome/INDEX.xml`](biome/INDEX.xml) | Fast linter/formatter | 2025-11-04 | [Official](https://biomejs.dev) |\n| 📦 [`claudecode/`](claudecode/) | 📄 [`claudecode/INDEX.xml`](claudecode/INDEX.xml) | Anthropic Claude Code | 2026-02-05 | [Official](https://code.claude.com) |\n| 📦 [`claudeplat/`](claudeplat/) | 📄 [`claudeplat/INDEX.xml`](claudeplat/INDEX.xml) | Anthropic Claude Platform | 2026-01-07 | [Official](https://platform.claude.com) |\n| 📦 [`clerk/`](clerk/) | 📄 [`clerk/INDEX.xml`](clerk/INDEX.xml) | Authentication | 2025-12-03 | [Official](https://clerk.com) |\n| 📦 [`convex/`](convex/) | 📄 [`convex/INDEX.xml`](convex/INDEX.xml) | Reactive database | 2026-01-07 | [Official](https://docs.convex.dev) |\n| 🪝 [`lefthook/`](lefthook/) | 📄 [`lefthook/INDEX.xml`](lefthook/INDEX.xml) | Git hooks manager | 2025-11-24 | [Official](https://github.com/evilmartians/lefthook) |\n| 📦 [`marimo/`](marimo/) | 📄 [`marimo/INDEX.xml`](marimo/INDEX.xml) | Reactive Python notebooks | 2025-11-11 | [Official](https://docs.marimo.io) |\n| 📦 [`nextjs/`](nextjs/) | 📄 [`nextjs/INDEX.xml`](nextjs/INDEX.xml) | React framework | 2025-12-02 | [Official](https://nextjs.org) |\n| 📦 [`playwright/`](playwright/) | 📄 
[`playwright/INDEX.xml`](playwright/INDEX.xml) | Browser testing | 2025-11-07 | [Official](https://playwright.dev) |\n| 📦 [`shadcn/`](shadcn/) | 📄 [`shadcn/INDEX.xml`](shadcn/INDEX.xml) | React UI components | 2025-12-16 | [Official](https://ui.shadcn.com), [Guide](https://shadcn.io) |\n| 📦 [`shiny/`](shiny/) | 📄 [`shiny/INDEX.xml`](shiny/INDEX.xml) | Python web apps | 2025-11-02 | [Official](https://shiny.posit.co/py/) |\n| 📦 [`tailwind/`](tailwind/) | 📄 [`tailwind/INDEX.xml`](tailwind/INDEX.xml) | CSS framework | 2025-10-15 | [Official](https://tailwindcss.com/docs/) |\n| 📦 [`tailwindplus/`](tailwindplus/) | 📄 [`tailwindplus/INDEX.xml`](tailwindplus/INDEX.xml) | Paid UI Components | 2025-11-16 | [Official](https://tailwindcss.com/plus) |\n| 📦 [`uv/`](uv/) | 📄 [`uv/INDEX.xml`](uv/INDEX.xml) | Python projects | 2026-01-16 | [Official](https://docs.astral.sh/uv/) |\n| 📦 [`vercel/`](vercel/) | 📄 [`vercel/INDEX.xml`](vercel/INDEX.xml) | Deployment platform | 2025-10-20 | [Official](https://vercel.com) |\n| 📦 [`vitest/`](vitest/) | 📄 [`vitest/INDEX.xml`](vitest/INDEX.xml) | Testing framework | 2025-11-05 | [Official](https://vitest.dev) |\n| 📦 [`zustand/`](zustand/) | 📄 [`zustand/INDEX.xml`](zustand/INDEX.xml) | State management | 2026-01-03 | [Official](https://zustand.docs.pmnd.rs) |\n\n*Curate your own collections. The [lefthook](lefthook/) collection is non-standard, docs directly downloaded from GitHub. For Anthropic docs use [this tool](https://github.com/ericbuess/claude-code-docs).*\n\n---\n\n## 🚀 Setup\n\n```bash\n# 1. Install UV\n# 👉 https://docs.astral.sh/uv/getting-started/installation/\n\n# 2. Clone repository\ngit clone https://github.com/michellepace/docs-for-ai.git\ncd docs-for-ai\n\n# 3. Get free FireCrawl API key\n# Visit: https://www.firecrawl.dev/app/api-keys\n\n# 4. 
Add to your shell profile\necho 'export API_KEY_MCP_FIRECRAWL=your-api-key-here' \u003e\u003e ~/.zshrc\nsource ~/.zshrc  # Use ~/.bashrc if that's your shell\n```\n\n## 📖 Usage via Slash Commands\n\n\u003e [!IMPORTANT]\n\u003e Edit the paths in [.claude/commands/ask-docs.md](.claude/commands/ask-docs.md) to match your local setup. To use from anywhere, move it to `~/.claude/commands/`.\n\n| Slash Command | Purpose | .md Files | INDEX `\u003csource\u003e` |\n|:--------|:--------|:----------|:----------|\n| `/curate-doc \u003ccollection\u003e \u003curl\u003e` | Add new or re-scrape | ✅ Write | ✅ Add/update INDEX.xml |\n| `/rescrape-docs \u003ccollection\u003e` | Re-scrape all docs | ✅ Write all | ✅ Selective update INDEX.xml |\n| `/improve-index-xml \u003ccollection\u003e` | Batch improve descriptions | 📖 Read | ✅ Update INDEX.xml |\n| `/ask-docs \u003ccollection\u003e \u003cquestion\u003e` | Query any collection | Docs analysed | Relevant docs identified |\n\n## 💡 Usage Example\n\nAssume tailwind was not already a collection in this repo:\n\n```bash\n# Start a new collection\n/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors\n# → Creates tailwind/ collection directory, with README.md + INDEX.xml, and first curated doc\n\n# Re-scrape existing doc (refresh content from same URL)\n/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors\n# → Re-scrapes, writes .md file, replaces source in INDEX.xml\n\n# Curate a new doc into collection\n/curate-doc tailwind https://tailwindcss.com/docs/styling-with-utility-classes\n# → Scrapes page into collection, writes .md file, adds source to INDEX.xml\n\n# Re-scrape all docs in collection\n/rescrape-docs tailwind\n# → Re-scrapes all URLs in INDEX.xml, writes all .md files, updates descriptions for changed content\n\n# ✨ Use the docs\n/ask-docs tailwind Please evaluate my project for correct usage of utility classes?\n# → Searches tailwind/INDEX.xml for relevant docs, analyses these, gives you an 
answer\n```\n\n## 🏗️ How This Repo Works\n\n**Workflow:** Python script scrapes URL → writes .md file → creates INDEX.xml entry with `PLACEHOLDER` description → Claude Code generates semantic description.\nThe `/curate-doc` command always regenerates the description, whereas `/rescrape-docs` only regenerates descriptions for files with content changes.\n\n**Directory Structure:**\n\n```text\nuv/\n├── INDEX.xml               # Index of all docs\n├── README.md\n├── api-reference.md        # Scraped doc\n├── getting-started.md      # Scraped doc\n└── ...\n```\n\n**INDEX.xml Schema:**\n\n```xml\n\u003cdocs_index\u003e\n  \u003csource\u003e\n    \u003ctitle\u003eHello Document Title\u003c/title\u003e\n    \u003cdescription\u003e20-30 word dense summary optimised for semantic search...\u003c/description\u003e\n    \u003csource_url\u003ehttps://docs.example.com/hello\u003c/source_url\u003e\n    \u003clocal_file\u003ehello-document-title.md\u003c/local_file\u003e\n    \u003cscraped_at\u003e2025-10-15\u003c/scraped_at\u003e\n  \u003c/source\u003e\n  \u003c!-- Multiple \u003csource\u003e entries, one per .md file --\u003e\n\u003c/docs_index\u003e\n```\n\nScripts use FireCrawl Python SDK. MCP server also configured ([.mcp.json](.mcp.json), [.claude/settings.json](.claude/settings.json)).\n\n---\n\n## 👉 Notes to Improve later\n\n### Old Idea\n\nInstead of crawling, rather go to GitHub and automate downloading and index creation. Docs are much cleaner than crawling. Keep .mdx files as-is; do not convert to .md. 
Trade-off: bulk downloads bloat the index; curating individually keeps focus.\n\n### New Idea (2026.01.16) — use `llms.txt` + direct fetch\n\nAn instruction was given to Claude Code and run successfully on the `uv/` directory, updating all documents via direct HTTP fetch (a Python script): no scraping, 100% clean output, and no Firecrawl tokens used.\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"x_docs/images/20260116_better_idea.jpg\"\u003e\n    \u003cimg src=\"x_docs/images/20260116_better_idea.jpg\" alt=\"Claude Code terminal showing user prompt to assess llms.txt approach: explains that instead of FireCrawl scraping (which isn't always clean), match INDEX.xml source_url entries to llms.txt markdown URLs and curl content directly. Shows Claude reading README.md, uv/llms.txt, and uv/INDEX.xml files.\" width=\"875\"\u003e\n  \u003c/a\u003e\n  \u003cp\u003e\u003cem\u003eRefactor to use llms.txt + direct fetch\u003c/em\u003e\u003c/p\u003e\n\u003c/div\u003e\n\nNoted here so the repo can be refactored to this method later. (The screenshot mentions `curl`, but we used Python's `urllib.request`.)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmichellepace%2Fdocs-for-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmichellepace%2Fdocs-for-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmichellepace%2Fdocs-for-ai/lists"}