An open API service indexing awesome lists of open source software.

https://github.com/pratiyush/llm-wiki

LLM-powered knowledge base from your Claude Code, Codex CLI, Copilot, Cursor & Gemini sessions. Karpathy's LLM Wiki pattern β€” implemented and shipped.
https://github.com/pratiyush/llm-wiki

ai claude-code cli codex-cli copilot cursor developer-tools gemini-cli karpathy knowledge-base llm llm-wiki markdown mcp obsidian open-source python session-history static-site wiki

Last synced: 18 days ago
JSON representation

LLM-powered knowledge base from your Claude Code, Codex CLI, Copilot, Cursor & Gemini sessions. Karpathy's LLM Wiki pattern β€” implemented and shipped.

Awesome Lists containing this project

README

          

# llmwiki

> **LLM-powered knowledge base from your Claude Code, Codex CLI, Cursor, Gemini CLI, and Obsidian sessions.**
> Built on [Andrej Karpathy's LLM Wiki pattern](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f).

## πŸ‘‰ Live demo: **[pratiyush.github.io/llm-wiki](https://pratiyush.github.io/llm-wiki/)**

Rebuilt on every `master` push from the synthetic sessions in [`examples/demo-sessions/`](examples/demo-sessions). No personal data. Shows every feature of the real tool (activity heatmap, tool charts, token usage, model info cards, vs-comparisons, project topics) running against safe reference data.

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/)
[![Version](https://img.shields.io/badge/version-v1.2.3-10B981.svg)](CHANGELOG.md)
[![Tests](https://img.shields.io/badge/tests-2068%20passing-10B981.svg)](tests/)
[![CI](https://github.com/Pratiyush/llm-wiki/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/Pratiyush/llm-wiki/actions/workflows/ci.yml)
[![Link check](https://github.com/Pratiyush/llm-wiki/actions/workflows/link-check.yml/badge.svg?branch=master)](https://github.com/Pratiyush/llm-wiki/actions/workflows/link-check.yml)
[![Wiki checks](https://github.com/Pratiyush/llm-wiki/actions/workflows/wiki-checks.yml/badge.svg?branch=master)](https://github.com/Pratiyush/llm-wiki/actions/workflows/wiki-checks.yml)
[![Docker](https://github.com/Pratiyush/llm-wiki/actions/workflows/docker-publish.yml/badge.svg)](https://github.com/Pratiyush/llm-wiki/pkgs/container/llm-wiki)
[![Works with Claude Code](https://img.shields.io/badge/Claude%20Code-βœ“-7C3AED.svg)](https://claude.com/claude-code)
[![Works with Codex CLI](https://img.shields.io/badge/Codex%20CLI-βœ“-7C3AED.svg)](https://github.com/openai/codex)
[![Works with Copilot](https://img.shields.io/badge/GitHub%20Copilot-βœ“-7C3AED.svg)](https://github.com/features/copilot)
[![Works with Cursor](https://img.shields.io/badge/Cursor-βœ“-7C3AED.svg)](https://cursor.com)
[![Works with Gemini CLI](https://img.shields.io/badge/Gemini%20CLI-βœ“-7C3AED.svg)](https://ai.google.dev/gemini-api)
[![Works with Obsidian](https://img.shields.io/badge/Obsidian-βœ“-7C3AED.svg)](https://obsidian.md)

---

Every Claude Code, Codex CLI, Copilot, Cursor, and Gemini CLI session writes a full transcript to disk. You already have hundreds of them and never look at them again.

**llmwiki** turns that dormant history into a beautiful, searchable, interlinked knowledge base β€” locally, in two commands. Plus, it produces AI-consumable exports (`llms.txt`, `llms-full.txt`, JSON-LD graph, per-page `.txt` + `.json` siblings) so other AI agents can query your wiki directly.

```bash
./setup.sh # one-time install
./build.sh && ./serve.sh # build + serve at http://127.0.0.1:8765
```

**Contributing in one line:** read [`CONTRIBUTING.md`](CONTRIBUTING.md), keep PRs focused (one concern each), use `feat:` / `fix:` / `docs:` / `chore:` / `test:` commit prefixes, never commit real session data (`raw/` is gitignored), no new runtime deps. CI must be green to merge.

## Screenshots

All screenshots below are from the **public demo site** which is built on every `master` push from the dummy example sessions. Your own wiki will look identical β€” just with your real work.

### Home β€” projects overview with activity heatmap
![llmwiki home page β€” LLM Wiki header, activity heatmap, and a grid of three demo projects (demo-blog-engine, demo-ml-pipeline, demo-todo-api)](docs/images/home.png)

### All sessions β€” filterable table across every project
![llmwiki sessions index β€” activity timeline above a table of eight demo sessions with project, model, date, message count, and tool-call columns](docs/images/sessions.png)

### Session detail β€” full conversation + tool calls
![llmwiki session detail β€” Rust blog engine scaffolding session showing summary, breadcrumbs, a TOML Cargo.toml block and a Rust main.rs block, both highlighted by highlight.js](docs/images/session-rust.png)

### Changelog β€” renders `CHANGELOG.md` as a first-class page
![llmwiki changelog page β€” keep-a-changelog format with colored headings for Added / Fixed / Changed and auto-linked PR references](docs/images/changelog.png)

### Projects index β€” freshness badges + per-project stats
![llmwiki projects index β€” three demo project cards with green/yellow/red freshness badges showing how recently each project was touched](docs/images/projects.png)

## What you get

### Human-readable
- **All your sessions**, converted from `.jsonl` to clean, redacted markdown
- **A Karpathy-style wiki** β€” `sources/`, `entities/`, `concepts/`, `syntheses/`, `comparisons/`, `questions/` linked with `[[wikilinks]]`
- **A beautiful static site** you can browse locally or deploy to GitHub Pages
- Global search (Cmd+K command palette with fuzzy match over pre-built index)
- [highlight.js](https://highlightjs.org/) client-side syntax highlighting (light + dark themes)
- Dark mode (system-aware + manual toggle with `data-theme`)
- Keyboard shortcuts: `/` search Β· `g h/p/s` nav Β· `j/k` rows Β· `?` help
- Collapsible tool-result sections (auto-expand > 500 chars)
- Copy-as-markdown + copy-code buttons
- Breadcrumbs + reading progress bar
- Filter bar on sessions table (project/model/date/text)
- Reading time estimates (`X min read`)
- Related pages panel at the bottom of every session
- Activity heatmap on the home page
- Model info cards with structured schema (provider, pricing, benchmarks)
- Auto-generated vs-comparison pages between AI models
- Append-only changelog timeline with pricing sparkline
- Project topic chips (GitHub-style tags on project cards)
- Agent labels (colored badges: Claude/Codex/Copilot/Cursor/Gemini)
- Recently-updated card on the home page
- Dataview-style structured queries in the command palette
- Hover-to-preview wikilinks
- Deep-link icons next to every heading
- Mobile-responsive + print-friendly

### AI-consumable (v0.4)
Every HTML page has sibling machine-readable files at the same URL:

- `.html` β€” human HTML with schema.org microdata
- `.txt` β€” plain text version (no HTML tags)
- `.json` β€” structured metadata + body

Site-level AI-agent entry points:

| File | What |
|---|---|
| [`/llms.txt`](https://llmstxt.org) | Short index per [llmstxt.org spec](https://llmstxt.org) |
| `/llms-full.txt` | Flattened plain-text dump (~5 MB cap) β€” paste into any LLM's context |
| `/graph.jsonld` | Schema.org JSON-LD entity/concept/source graph |
| `/sitemap.xml` | Standard sitemap with `lastmod` |
| `/rss.xml` | RSS 2.0 feed of newest sessions |
| `/robots.txt` | AI-friendly robots with llms.txt reference |
| `/ai-readme.md` | AI-specific navigation instructions |
| `/manifest.json` | Build manifest with SHA-256 hashes + perf budget |

Every page also includes an `` HTML comment that AI agents can parse without fetching the separate `.json` sibling.

### Quality & governance (v1.0)
- **4-factor confidence scoring** β€” source count, source quality, recency, cross-references; with Ebbinghaus-inspired decay per content-type
- **5-state lifecycle machine** β€” draft β†’ reviewed β†’ verified β†’ stale β†’ archived with 90-day auto-stale
- **16 lint rules** β€” 8 structural (frontmatter, link integrity, orphans, freshness, duplicates, index sync…) + 3 LLM-powered (contradictions, claim verification, summary accuracy) + stale_candidates (#51) + tags_topics_convention (#302) + stale_reference_detection (#303) + frontmatter_count_consistency (#378) + tools_consistency (#378)
- **Auto Dream** β€” MEMORY.md consolidation after 24h + 5 sessions: resolve relative dates, prune outdated, 200-line cap
- **9 navigation files** β€” CLAUDE.md, AGENTS.md, MEMORY.md, SOUL.md, CRITICAL_FACTS.md, hints.md, hot.md + per-project hot caches

### Obsidian-native experience (v1.0)
- **`link-obsidian` CLI** β€” symlinks the whole project into an Obsidian vault; graph view + backlinks + full-text search just work
- **Dataview dashboard** β€” 10 ready-to-use queries (recently updated, by confidence, by lifecycle, by project, by entity type, open questions, stale pages)
- **Templater templates** β€” 4 templates for source/entity/concept/synthesis pages, seeded with confidence + lifecycle + today's date
- **Category pages** β€” tag-based index pages in both Dataview (Obsidian) and static markdown (HTML) modes
- **Integration guide** β€” [`docs/obsidian-integration.md`](docs/obsidian-integration.md) covers 6 recommended plugins with per-plugin configs

### Automation
- **SessionStart hook** β€” auto-syncs new sessions in the background on every Claude Code launch
- **Auto-build on sync** β€” `/wiki-sync` triggers `/wiki-build` (configurable; default on)
- **One-shot pipeline** β€” `llmwiki all` runs build β†’ graph β†’ export β†’ lint in a single command (`--strict` for CI)
- **MCP server** β€” 12 production tools (query, search, list, read, lint, sync, export, + confidence, lifecycle, dashboard, entity search, category browse) queryable from any MCP client (Claude Desktop, Cline, Cursor, ChatGPT desktop)
- **Pending ingest queue** β€” SessionStart hook converts + queues; `/wiki-sync` processes queue
- **No servers, no database, no npm** β€” Python stdlib + `markdown`. Syntax highlighting loads from a highlight.js CDN at view time.

## How it works

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ ~/.claude/projects/*/*.jsonl β”‚ ← Claude Code sessions
β”‚ ~/.codex/sessions/**/*.jsonl β”‚ ← Codex CLI sessions
β”‚ ~/Library/.../Cursor/workspaceS… β”‚ ← Cursor
β”‚ ~/Documents/Obsidian Vault/ β”‚ ← Obsidian
β”‚ ~/.gemini/ β”‚ ← Gemini CLI
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό python3 -m llmwiki sync
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ raw/sessions// β”‚ ← immutable markdown (Karpathy layer 1)
β”‚ 2026-04-08-.md β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό /wiki-ingest (your coding agent)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ wiki/sources/.md β”‚ ← LLM-generated wiki (Karpathy layer 2)
β”‚ wiki/entities/.md β”‚
β”‚ wiki/concepts/.md β”‚
β”‚ wiki/syntheses/.md β”‚
β”‚ wiki/comparisons/.md β”‚
β”‚ wiki/questions/.md β”‚
β”‚ wiki/index.md, overview.md, log.md β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό python3 -m llmwiki build
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ site/ β”‚ ← static HTML + AI exports
β”‚ β”œβ”€β”€ index.html, style.css, ... β”‚
β”‚ β”œβ”€β”€ sessions//.html β”‚
β”‚ β”œβ”€β”€ sessions//.txt β”‚ (AI sibling)
β”‚ β”œβ”€β”€ sessions//.json β”‚ (AI sibling)
β”‚ β”œβ”€β”€ llms.txt, llms-full.txt β”‚
β”‚ β”œβ”€β”€ graph.jsonld β”‚
β”‚ β”œβ”€β”€ sitemap.xml, rss.xml β”‚
β”‚ β”œβ”€β”€ robots.txt, ai-readme.md β”‚
β”‚ β”œβ”€β”€ manifest.json β”‚
β”‚ └── search-index.json β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

See [docs/architecture.md](docs/architecture.md) for the full 3-layer Karpathy + 8-layer build breakdown.

## Documentation

Full production documentation lives under [`docs/`](docs/). The editorial
hub is **[`docs/index.md`](docs/index.md)** β€” tutorials, per-agent guides,
reference, and deployment, all in one place.

**Start here:**

| Goal | Read |
|---|---|
| Install and build your first site in 10 minutes | [Tutorial 01 β†’ 02](docs/tutorials/01-installation.md) |
| Use llmwiki with Claude Code | [Tutorial 03](docs/tutorials/03-use-with-claude-code.md) |
| Use llmwiki with Codex CLI | [Tutorial 04](docs/tutorials/04-use-with-codex-cli.md) |
| Query / lint / review your wiki daily | [Tutorial 05](docs/tutorials/05-querying-your-wiki.md) |
| Point llmwiki at an existing Obsidian / Logseq vault | [Tutorial 06](docs/tutorials/06-bring-your-obsidian-vault.md) |
| See four real end-to-end workflows | [Tutorial 07](docs/tutorials/07-example-workflows.md) |

Contributing to docs? See the **[style guide](docs/style-guide.md)**.

## Install

### macOS / Linux

```bash
git clone https://github.com/Pratiyush/llm-wiki.git
cd llm-wiki
./setup.sh
```

### Windows

```cmd
git clone https://github.com/Pratiyush/llm-wiki.git
cd llm-wiki
setup.bat
```

### With pip (v0.3+)

```bash
pip install -e . # basic β€” everything you need
pip install -e '.[pdf]' # + PDF ingestion
pip install -e '.[dev]' # + pytest + ruff
pip install -e '.[all]' # all of the above
```

Syntax highlighting is now powered by [highlight.js](https://highlightjs.org/), loaded from a CDN at view time β€” no optional deps required.

### What setup does

1. Creates `raw/`, `wiki/`, `site/` data directories
2. Installs the `llmwiki` Python package in-place
3. Detects your coding agents and enables matching adapters
4. Optionally offers to install the `SessionStart` hook into `~/.claude/settings.json` for auto-sync
5. Runs a first sync so you see output immediately

## For maintainers

Running the project? The governance scaffold lives under [`docs/maintainers/`](docs/maintainers) and is loaded by a dedicated skill:

| File | What it's for |
|---|---|
| [`CONTRIBUTING.md`](CONTRIBUTING.md) | Short rules for contributors β€” read this first |
| [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md) | Contributor Covenant 2.1 |
| [`SECURITY.md`](SECURITY.md) | Disclosure process for redaction bugs, XSS, data leaks |
| [`docs/maintainers/ARCHITECTURE.md`](docs/maintainers/ARCHITECTURE.md) | One-page system diagram + layer boundaries + what NOT to add |
| [`docs/maintainers/REVIEW_CHECKLIST.md`](docs/maintainers/REVIEW_CHECKLIST.md) | Canonical code-review criteria |
| [`docs/maintainers/RELEASE_PROCESS.md`](docs/maintainers/RELEASE_PROCESS.md) | Version bump β†’ CHANGELOG β†’ tag β†’ build β†’ publish |
| [`docs/maintainers/TRIAGE.md`](docs/maintainers/TRIAGE.md) | Label taxonomy + stale-issue policy |
| [`docs/maintainers/ROADMAP.md`](docs/maintainers/ROADMAP.md) | Near-term plan + release themes |
| [`docs/maintainers/DECLINED.md`](docs/maintainers/DECLINED.md) | Graveyard of declined ideas with reasons |

Four Claude Code slash commands automate the common ops:

- `/review-pr ` β€” apply the REVIEW_CHECKLIST to a PR and post findings
- `/triage-issue ` β€” label + milestone + priority a new issue
- `/release ` β€” walk the release process step by step
- `/maintainer` β€” meta-skill that loads every governance doc as context

## Running E2E tests

The unit suite (`pytest tests/` β€” 472 tests) runs in milliseconds and
covers every module. The **end-to-end suite** under `tests/e2e/` is
separate: it builds a minimal demo site, serves it on a random port,
drives a real browser via [Playwright](https://playwright.dev/python),
and runs scenarios written in [Gherkin](https://cucumber.io/docs/gherkin/)
via [pytest-bdd](https://pytest-bdd.readthedocs.io/).

Why both? Unit tests lock the contract at the module boundary;
E2E locks the contract at the **user's browser**. A diff that passes
unit tests but breaks the Cmd+K palette will fail E2E.

Install the extras (one-time, ~300 MB for Chromium):

```bash
pip install -e '.[e2e]'
python -m playwright install chromium
```

Run the suite:

```bash
pytest tests/e2e/ --browser=chromium
```

Run a single feature:

```bash
pytest tests/e2e/test_command_palette.py --browser=chromium -v
```

The E2E suite is **excluded from the default `pytest tests/` run**
(see the `--ignore=tests/e2e` addopt in `pyproject.toml`) so you
can iterate on the unit suite without waiting for browser installs.
CI runs the E2E job as a separate workflow (`.github/workflows/e2e.yml`)
that only fires on PRs touching `build.py`, the viz modules, or
`tests/e2e/**`.

Feature files live under `tests/e2e/features/` β€” one per UI area
(homepage, session page, command palette, keyboard nav, mobile nav,
theme toggle, copy-as-markdown, **responsive breakpoints**, **edge
cases**, **accessibility**, **visual regression**). Step definitions
are all in `tests/e2e/steps/ui_steps.py`. Adding a new scenario is
usually a 2-line change to a `.feature` file plus maybe one new step.

Run locally with an HTML report:

```bash
pytest tests/e2e/ --browser=chromium \
--html=e2e-report/index.html --self-contained-html
open e2e-report/index.html # macOS β€” opens the browseable report
```

**Where to see test reports:**

| What | Where |
|---|---|
| Unit test results | GitHub Actions β†’ `ci.yml` β†’ latest run β†’ `lint-and-test` job logs |
| E2E HTML report | GitHub Actions β†’ `e2e.yml` β†’ latest run β†’ Artifacts β†’ `e2e-html-report` (14-day retention) |
| Visual regression screenshots | Same run β†’ Artifacts β†’ `e2e-screenshots` |
| Playwright traces (failed runs only) | Same run β†’ Artifacts β†’ `playwright-traces` (open with `playwright show-trace `) |
| Demo site deploy status | GitHub Actions β†’ `pages.yml` β†’ latest run |

Locally, the HTML report is one file (`e2e-report/index.html`) that
you can open in any browser β€” pass/fail per scenario, duration,
stdout/stderr, screenshot on failure.

## Scheduled sync

For a daily / weekly cron-style sync, schedule `llmwiki sync` directly via your OS's native job runner (`launchd` on macOS, `systemd` on Linux, Task Scheduler on Windows). Paths and adapter selection come from `examples/sessions_config.json`.

## CLI reference

```bash
llmwiki init # scaffold raw/ wiki/ site/ + seed nav files
llmwiki sync # convert .jsonl β†’ markdown (auto-build + auto-lint if configured)
llmwiki build # compile static HTML + AI exports
llmwiki serve # local HTTP server on 127.0.0.1:8765
llmwiki adapters # list available adapters + configured state (v1.0)
llmwiki graph # build knowledge graph (v0.2)
llmwiki watch # file watcher with debounce (v0.2)
llmwiki export-obsidian # write wiki to Obsidian vault (v0.2)
llmwiki lint # 16-rule wiki lint (v1.2)
llmwiki export # AI-consumable exports (v0.4)
llmwiki synthesize # auto-ingest synthesis pipeline (v0.5)
llmwiki all # build β†’ graph β†’ export β†’ lint in one shot (v1.2)
llmwiki version
```

Each subcommand has its own `--help`. All commands are also wrapped in one-click shell/batch scripts: `sync.sh`/`.bat`, `build.sh`/`.bat`, `serve.sh`/`.bat`, `upgrade.sh`/`.bat`.

## Works with

| Agent | Adapter | Status | Added in |
|---|---|---|---|
| [Claude Code](https://claude.com/claude-code) | `llmwiki.adapters.claude_code` | βœ… Production | v0.1 |
| [Obsidian](https://obsidian.md) (input) | `llmwiki.adapters.obsidian` | βœ… Production | v0.1 |
| [Obsidian](https://obsidian.md) (output) | `llmwiki.obsidian_output` | βœ… Production | v0.2 |
| [Codex CLI](https://github.com/openai/codex) | `llmwiki.adapters.codex_cli` | βœ… Production | v0.3 |
| [Cursor](https://cursor.com) | `llmwiki.adapters.cursor` | βœ… Production | v0.5 |
| [Gemini CLI](https://ai.google.dev/gemini-api) | `llmwiki.adapters.gemini_cli` | βœ… Production | v0.5 |
| PDF files | `llmwiki.adapters.pdf` | βœ… Production | v0.5 |
| [Copilot Chat](https://github.com/features/copilot) | `llmwiki.adapters.copilot_chat` | βœ… Production | v0.9 |
| [Copilot CLI](https://github.com/features/copilot) | `llmwiki.adapters.copilot_cli` | βœ… Production | v0.9 |
| OpenCode / OpenClaw | β€” | ⏸ Deferred | β€” |

Adding a new agent is [one small file](docs/framework.md) β€” subclass `BaseAdapter`, declare `SUPPORTED_SCHEMA_VERSIONS`, ship a fixture + snapshot test.

## MCP server

llmwiki ships its own MCP server (stdio transport, no SDK dependency) so any MCP client can query your wiki directly.

```bash
python3 -m llmwiki.mcp # runs on stdin/stdout
```

Twelve production tools (7 core + 5 added in v1.0 `#159`):

| Tool | What |
|---|---|
| `wiki_query(question, max_pages)` | Keyword search + page content (no LLM synthesis) |
| `wiki_search(term, include_raw)` | Raw grep over wiki/ (+ optional raw/) |
| `wiki_list_sources(project)` | List raw source files with metadata |
| `wiki_read_page(path)` | Read one page (path-traversal guarded) |
| `wiki_lint()` | Orphans + broken-wikilinks report |
| `wiki_sync(dry_run)` | Trigger the converter |
| `wiki_export(format)` | Return any AI-consumable export (llms.txt, jsonld, sitemap, rss, manifest) |
| `wiki_confidence(min, max)` | Pages by confidence range (v1.0) |
| `wiki_lifecycle(state)` | Pages by draft/reviewed/verified/stale/archived (v1.0) |
| `wiki_dashboard()` | Health summary: counts by type, lifecycle, confidence (v1.0) |
| `wiki_entity_search(name, entity_type)` | Search entities by name substring or type (v1.0) |
| `wiki_category_browse(tag)` | Browse tags with counts, drill into specific tag (v1.0) |

Register in your MCP client's config β€” e.g. for Claude Desktop, add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
"mcpServers": {
"llmwiki": {
"command": "python3",
"args": ["-m", "llmwiki.mcp"]
}
}
}
```

## Configuration

Single JSON config at `examples/sessions_config.json`. Copy to `config.json` and edit:

```json
{
"filters": {
"live_session_minutes": 60,
"exclude_projects": []
},
"redaction": {
"real_username": "YOUR_USERNAME",
"replacement_username": "USER",
"extra_patterns": [
"(?i)(api[_-]?key|secret|token|bearer|password)...",
"sk-[A-Za-z0-9]{20,}"
]
},
"truncation": {
"tool_result_chars": 500,
"bash_stdout_lines": 5
},
"adapters": {
"obsidian": {
"vault_paths": ["~/Documents/Obsidian Vault"]
}
}
}
```

All paths, regexes, truncation limits, and per-adapter settings are tunable. See [docs/configuration.md](docs/configuration.md).

## `.llmwikiignore`

Gitignore-style pattern file at the repo root. Skip entire projects, dates, or specific sessions without touching config:

```
# Skip a whole project
confidential-client/
# Skip anything before a date
*2025-*
# Keep exception
!confidential-client/public-*
```

## Karpathy's LLM Wiki pattern

This project follows the three-layer structure described in [Karpathy's gist](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f):

1. **Raw sources** (`raw/`) β€” immutable. Session transcripts converted from `.jsonl`.
2. **The wiki** (`wiki/`) β€” LLM-generated. One page per entity, concept, source. Interlinked via `[[wikilinks]]`.
3. **The schema** (`CLAUDE.md`, `AGENTS.md`) β€” tells your agent how to ingest and query.

See [docs/architecture.md](docs/architecture.md) for the full breakdown and how it maps to the file tree.

## Design principles

- **Stdlib first** β€” only mandatory runtime dep is `markdown`. `pypdf` is an optional extra for PDF ingestion.
- **Works offline** β€” no Google fonts, no external CSS. Syntax highlighting loads from a highlight.js CDN but degrades gracefully without it.
- **Redact by default** β€” username, API keys, tokens, emails all get redacted before entering the wiki.
- **Idempotent everything** β€” re-running any command is safe and cheap.
- **Agent-agnostic core** β€” the converter doesn't know which agent produced the `.jsonl`; adapters translate.
- **Privacy by default** β€” localhost-only binding, no telemetry, no cloud calls.
- **Dual-format output (v0.4)** β€” every page ships both for humans (HTML) and AI agents (TXT + JSON + JSON-LD + sitemap + llms.txt).

## Docs

- [Getting started](docs/getting-started.md) β€” 5-minute quickstart
- **[Setup guide](docs/tutorials/setup-guide.md)** β€” 15-minute end-to-end tutorial: local setup β†’ deploy to GitHub Pages β†’ customization (v1.0)
- [Obsidian integration](docs/obsidian-integration.md) β€” 5-minute setup, 6 recommended plugins, config recipes (v1.0)
- [Architecture](docs/architecture.md) β€” Karpathy 3-layer + 8-layer build breakdown
- [Configuration](docs/configuration.md) β€” every tuning knob
- [Privacy](docs/privacy.md) β€” redaction rules + `.llmwikiignore` + localhost binding
- [Windows setup](docs/windows-setup.md) β€” Windows-specific gotchas
- [Framework](docs/framework.md) β€” Open Source Framework v4.1 adapted for agent-native dev tools
- [Research](docs/research.md) β€” Phase 1.25 analysis of 15 prior LLM Wiki implementations
- [Feature matrix](docs/feature-matrix.md) β€” all 161 features across 16 categories
- [Roadmap](docs/roadmap.md) β€” Phase Γ— Layer Γ— Item MoSCoW table
- **Translations**: [i18n/zh-CN](docs/i18n/zh-CN/), [i18n/ja](docs/i18n/ja/), [i18n/es](docs/i18n/es/)

Per-adapter docs:
- [Claude Code adapter](docs/adapters/claude-code.md)
- [Codex CLI adapter](docs/adapters/codex-cli.md)
- [Cursor adapter](docs/adapters/cursor.md)
- [Gemini CLI adapter](docs/adapters/gemini-cli.md)
- [Obsidian adapter](docs/adapters/obsidian.md)
- [Copilot adapter (Chat + CLI)](docs/adapters/copilot.md)

## Releases

| Version | Focus | Tag |
|---|---|---|
| [v0.1.0](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.1.0) | Core release β€” Claude Code adapter, god-level HTML UI, schema, CI, plugin scaffolding | `v0.1.0` |
| [v0.2.0](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.2.0) | Extensions β€” 3 new slash commands, 3 new adapters, Obsidian bidirectional, full MCP server | `v0.2.0` |
| [v0.3.0](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.3.0) | PyPI packaging, eval framework, i18n scaffold | `v0.3.0` |
| [v0.4.0](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.4.0) | AI + human dual format β€” per-page .txt/.json siblings, llms.txt, JSON-LD graph, sitemap, RSS, schema.org microdata, reading time, related pages, activity heatmap, deep-link anchors, build manifest, link checker, `wiki_export` MCP tool | `v0.4.0` |
| v0.5.0 – v0.9.0 | Internal sprint milestones β€” features (`_context.md`, auto-ingest, qmd export, model-profile schema, activity heatmap, Copilot adapters, etc.) shipped consolidated under the v0.9.x line. No standalone tags were published. | β€” |
| [v0.9.1](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.9.1) | Sprint 1 & 2 foundation β€” link-obsidian CLI, 4-factor confidence scoring, 5-state lifecycle machine, llmbook-reference skill, 7 entity types, flat raw/ naming, pending ingest queue, `_context.md` stubs, meeting + Jira adapters, configurable Web Clipper intake, rich log format | `v0.9.1` |
| [v0.9.2](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.9.2) | Sprint 3 quality β€” 11 lint rules (8 basic + 3 LLM-powered), Auto Dream MEMORY.md consolidation, Dataview dashboard template, category pages (Dataview + static), auto-build on sync + configurable lint schedule | `v0.9.2` |
| [v0.9.3](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.9.3) | Sprint 3 polish β€” Obsidian Templater templates, integration guide, two-way editing tests, MCP server 7β†’12 tools, adapter config validation, pipeline fix (sigstore, PyPI gate) | `v0.9.3` |
| [v0.9.4](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.9.4) | Session C1 (Sprint 4) β€” multi-agent skill installer, enhanced search with facets, configurable scheduled sync (launchd/systemd/Task Scheduler), CI wiki-checks workflow | `v0.9.4` |
| [v0.9.5](https://github.com/Pratiyush/llm-wiki/releases/tag/v0.9.5) | Docs polish + consistency audit before v1.0.0 | `v0.9.5` |
| [v1.0.0](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.0.0) | Production-ready Obsidian integration β€” full v1.0 scope | `v1.0.0` |
| [v1.1.0-rc1](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc1) | Solo quick-win sprint β€” candidates workflow, Ollama scaffold, prompt-cache scaffold | `v1.1.0-rc1` |
| [v1.1.0-rc2](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc2) | Session E β€” interactive graph viewer + remaining code-only v1.1 work | `v1.1.0-rc2` |
| [v1.1.0-rc3](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc3) | Gap-sweep bundle β€” state portability, quarantine, sync --status, log CLI, synthesize --estimate breakdown, tag family, stale references, graph context menu, raw immutability, AI-sessions default | `v1.1.0-rc3` |
| [v1.1.0-rc4](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc4) | Navigation + quality β€” graph `site_url` resolver (99.7% β†’ 0% dead clicks), `llmwiki backlinks` CLI (95% β†’ 0% orphan pages), source-code β†’ GitHub link rewriter (471 β†’ 100 broken), verify-before-fixing contribution rule | `v1.1.0-rc4` |
| [v1.1.0-rc5](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc5) | Site audit + 5 closed batches β€” session-local ref stripping (351 β†’ 247 broken), cheatsheet, README/CONTRIBUTING compile, expanded E2E, slash-CLI parity test, 4 adapter docs, Ollama tutorial, dual-mode docs skeleton, `/wiki-synthesize` slash | `v1.1.0-rc5` |
| [v1.1.0-rc6](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc6) | rc6 batch β€” fixed adapter tag hardcoded to `claude-code` for every adapter (#346), tutorial UX polish with in-page TOC + prev/next + edit-on-GitHub (#282), command palette now indexes 107 doc pages + 17 slash commands (#277), content-hash cache for `md_to_html` (#283) | `v1.1.0-rc6` |
| [v1.1.0-rc7](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc7) | rc7 batch β€” automatic AI-suggested tags during synthesis (#351), link-checker config fix (#348, #350, #353) | `v1.1.0-rc7` |
| [v1.1.0-rc8](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.1.0-rc8) | rc8 batch β€” complete Mode B agent-delegate backend (#316): new `llmwiki synthesize --list-pending` + `--complete ` CLI subcommands, `/wiki-sync` step 6 auto-detects pending prompts, Mode B ships end-to-end without an API key | `v1.1.0-rc8` |
| [**v1.2.0**](https://github.com/Pratiyush/llm-wiki/releases/tag/v1.2.0) | **First stable on the 1.x line** β€” `llmwiki all` one-shot pipeline runner, Playwright + axe-core E2E suite (#384), project-stub auto-seeding, 2 new lint rules, critical export-fidelity + sync-collision fixes, 10 UX-critique items (#387). PyPI distribution name: `llm-notebook`. | `v1.2.0` |

## Roadmap

Shipped milestones:

- **v0.5.0** β€” Folder-level `_context.md`, auto-ingest, adapter graduations, lazy search index, scheduled sync, WCAG, E2E tests ([milestone](https://github.com/Pratiyush/llm-wiki/milestone/4))
- **v0.6.0** β€” qmd export, GitLab Pages CI, PyPI release automation, maintainer governance scaffold ([milestone](https://github.com/Pratiyush/llm-wiki/milestone/5))
- **v0.7.0** β€” Structured model-profile schema, vs-comparison pages, append-only changelog timeline ([milestone](https://github.com/Pratiyush/llm-wiki/milestone/7))
- **v0.8.0** β€” 365-day activity heatmap, tool-calling bar chart, token usage card, session metrics frontmatter ([milestone](https://github.com/Pratiyush/llm-wiki/milestone/8))
- **v0.9.0** β€” Project topics, agent labels, Copilot adapters, image pipeline, highlight.js, public demo deployment
- **v0.9.x** β€” Sprint 1-4 foundation for v1.0.0 Obsidian integration: confidence scoring, lifecycle state machine, 9 navigation files, 11 lint rules, Auto Dream, Dataview dashboard, multi-agent skills, 12-tool MCP server, meeting + Jira adapters

Active milestones:

| Milestone | Focus | Tracking |
|---|---|---|
| **v1.0.0** | Final docs polish + PyPI trusted publisher + release | [Milestone](https://github.com/Pratiyush/llm-wiki/milestone/9) |
| **v1.1.0** | Ollama backend, prompt caching, interactive graph viewer, Homebrew tap | [Milestone](https://github.com/Pratiyush/llm-wiki/milestone/10) |
| **v1.2.0** | ChatGPT + OpenCode adapters, vault-overlay mode, tree-aware search, cache tiers | [Milestone](https://github.com/Pratiyush/llm-wiki/milestone/11) |

### Deployment targets

- **GitHub Pages** β€” shipped in v0.1 via `.github/workflows/pages.yml` (triggers on push to master). See [`docs/deploy/github-pages.md`](docs/deploy/github-pages.md).
- **Docker / GHCR** β€” pull and run: `docker compose pull && docker compose up -d`. Image published to `ghcr.io/pratiyush/llm-wiki` on every tag push. See [`docs/deploy/docker.md`](docs/deploy/docker.md).
- **GitLab Pages** β€” copy [`.gitlab-ci.yml.example`](.gitlab-ci.yml.example) β†’ `.gitlab-ci.yml`. See [`docs/deploy/gitlab-pages.md`](docs/deploy/gitlab-pages.md).
- **Vercel / Netlify** β€” static deploy after `llmwiki build`. See [`docs/deploy/vercel-netlify.md`](docs/deploy/vercel-netlify.md).
- **Any static host** β€” `llmwiki build` writes to `site/`, which you can `rsync`/`scp` anywhere.

## Acknowledgements

- [Andrej Karpathy](https://twitter.com/karpathy) for [the LLM Wiki idea](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f)
- [SamurAIGPT/llm-wiki-agent](https://github.com/SamurAIGPT/llm-wiki-agent), [lucasastorian/llmwiki](https://github.com/lucasastorian/llmwiki), [xoai/sage-wiki](https://github.com/xoai/sage-wiki), and [bashiraziz/llm-wiki-template](https://github.com/bashiraziz/llm-wiki-template) β€” prior art that shaped this.
- [Python Markdown](https://python-markdown.github.io/) for the rendering pipeline, and [highlight.js](https://highlightjs.org/) for client-side syntax highlighting.
- [llmstxt.org](https://llmstxt.org) for the llms.txt spec used in v0.4.

## License

[MIT](LICENSE) Β© Pratiyush