{"id":49318633,"url":"https://github.com/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault","last_synced_at":"2026-05-27T22:01:01.131Z","repository":{"id":353948037,"uuid":"1219657136","full_name":"synpulse8-opensource/pulse8-ai-cortex-knowledge-vault","owner":"synpulse8-opensource","description":"Agent-native knowledge OS built on Markdown. A shared vault for AI agents and humans, backed by a typed knowledge graph, full-text search, and an LLM-powered compiler, all accessible through MCP. Drop files in, let agents read, write, search, link, and compile knowledge. No database required.","archived":false,"fork":false,"pushed_at":"2026-05-04T21:27:38.000Z","size":575,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-04T21:35:01.112Z","etag":null,"topics":["ai-agents","docker","fastapi","karpathy-inspired","karpathy-llm-wiki","knowledge-base","knowledge-graph","knowledge-management","llm","llm-wiki","markdown","mcp","python","qmd"],"latest_commit_sha":null,"homepage":"https://www.synpulse8.com/en/our-solutions/pulse8#pulse8-ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/synpulse8-opensource.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-24T05:06:17.000Z","updated_at":"2026-05-04T21:27:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"0f5d16a7-f6d4-4f6e-ac12-8b241d79fbe3","html_url":"https://github.com/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault","commit_stats":null,"previous_names":["synpulse8-opensource/pulse8-ai-cortex-knowledge-vault"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/synpulse8-opensource","download_url":"https://codeload.github.com/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33585203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-27T02:00:06.184Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","docker","fastapi","karpathy-inspired","karpathy-llm-wiki","knowledge-base","knowledge-graph","knowledge-management","llm","llm-wiki","markdown","mcp","python","qmd"],"created_at":"2026-04-26T17:00:18.644Z","updated_at":"2026-05-27T22:01:01.124Z","avatar_url":"https://github.com/synpulse8-opensource.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/pulse8-banner.png\" alt=\"PULSE8.ai\" width=\"600\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003ePULSE8.ai Cortex\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eAgent-native knowledge OS built on Markdown\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault/actions/workflows/pylint.yml\"\u003e\u003cimg src=\"https://github.com/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault/actions/workflows/pylint.yml/badge.svg\" alt=\"Build\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault/releases/latest\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/synpulse8-opensource/pulse8-ai-cortex-knowledge-vault\" alt=\"Release\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE.md\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-Apache%202.0-blue\" alt=\"License\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/python-3.12+-3776AB?logo=python\u0026logoColor=white\" alt=\"Python\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/FastAPI-009688?logo=fastapi\u0026logoColor=white\" alt=\"FastAPI\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/MCP-Model%20Context%20Protocol-blueviolet\" alt=\"MCP\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Docker-2496ED?logo=docker\u0026logoColor=white\" alt=\"Docker\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/NetworkX-graph%20engine-orange\" alt=\"NetworkX\"\u003e\n\u003c/p\u003e\n\nPULSE8.ai Cortex is an agent-native knowledge OS built on Markdown. It gives AI agents and humans a shared vault backed by a typed knowledge graph, full-text search, and a [MarkItDown](https://github.com/microsoft/markitdown)-powered compiler — all accessible through a unified [MCP](https://modelcontextprotocol.io/) interface.\n\nDrop files in (PDF, DOCX, PPTX, XLSX, HTML, images, and more), let agents read, write, search, link, and compile knowledge — no database required.\n\n\u003e Inspired by [Andrej Karpathy](https://github.com/karpathy)'s [LLM Wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) pattern — a persistent, compounding knowledge base maintained by LLMs instead of re-derived on every query. Search powered by [Tobi Lütke](https://github.com/tobi)'s [QMD](https://github.com/tobi/qmd).\n\n---\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eGet started\u003c/h2\u003e\u003c/summary\u003e\n\n\u003e [!NOTE]\n\u003e PULSE8.ai Cortex requires Docker. An [OpenRouter API key](https://openrouter.ai/keys) is optional — needed only for LLM-powered cross-referencing between wiki articles. File conversion works out of the box without any API key.\n\n1. Clone the repository:\n  ```bash\n    git clone https://github.com/pulse8-ai/cortex-knowledge-vault.git\n    cd cortex-knowledge-vault\n  ```\n2. Launch PULSE8.ai Cortex:\n  ```bash\n    ./scripts/start.sh\n  ```\n    This builds and starts both **PULSE8.ai Cortex** (API + MCP on `:8420`) and **QMD** (search on `:3100`), waits for health checks, and you're ready to go.\n3. Connect your MCP client (e.g. Claude Desktop) to `http://localhost:8420/mcp/`.\n\nTo stop: `./scripts/stop.sh`\n\n### Cortex-only mode (macOS / native QMD)\n\nIf you want QMD to run natively (e.g. on macOS with Metal GPU acceleration), start only the Cortex container:\n\n```bash\n# Terminal 1: Run QMD natively\nnpm install -g @tobilu/qmd\nVAULT_PATH=./example_vault node docker/qmd/server.mjs\n\n# Terminal 2: Start only Cortex in Docker\n./scripts/start.sh --cortex-only\n```\n\nTo stop: `./scripts/stop.sh --cortex-only`\n\n### GPU-accelerated QMD (EC2 / Linux with NVIDIA GPU)\n\nFor production deployments with NVIDIA GPU acceleration:\n\n```bash\ndocker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build -d\n```\n\nSee [docs/ec2-gpu-setup.md](docs/ec2-gpu-setup.md) for a full guide on instance selection, NVIDIA toolkit installation, and cost estimates.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eFeatures\u003c/h2\u003e\u003c/summary\u003e\n\n|                      |                                                                                                         |\n| -------------------- | ------------------------------------------------------------------------------------------------------- |\n| **Knowledge Graph**  | Typed graph engine (NetworkX) — wikilinks, tags, and custom edges, auto-maintained on every file change                  |\n| **Full-Text Search** | BM25 keyword search via QMD with optional hybrid (vector + reranking) mode                                               |\n| **File Compiler**    | Converts raw sources (PDF, DOCX, PPTX, XLSX, HTML, images, etc.) to Markdown via [MarkItDown](https://github.com/microsoft/markitdown). LLM used only for cross-referencing. |\n| **MCP Server**       | Streamable HTTP + stdio transport — works with Claude Desktop, Cursor, and any MCP client                                |\n| **Bulk Ingest**      | Ingest dozens or hundreds of files at once from a local directory with SHA-256 dedup and bounded concurrency              |\n| **REST API**         | FastAPI endpoints mirroring all MCP tools at `/api/v1/`, including multipart file upload and bulk ingest                 |\n| **Vault Watcher**    | Real-time filesystem monitoring — graph stays in sync automatically                                                      |\n| **Zero Database**    | Everything persists as Markdown + JSON on your filesystem                                                                |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eMCP tools\u003c/h2\u003e\u003c/summary\u003e\n\n| Tool            | Description                                                        |\n| --------------- | ------------------------------------------------------------------ |\n| `vault_read`    | Read a note by path                                                |\n| `vault_write`   | Create or update a note                                            |\n| `vault_search`  | Search the vault (keyword / semantic / hybrid)                     |\n| `vault_link`    | Create, query, or delete graph edges                               |\n| `vault_context` | Build a context window: search → graph traversal → ranked subgraph |\n| `vault_ingest`  | Ingest raw content or binary files (supports `content_base64` for binary) |\n| `vault_compile` | Compile unprocessed raw sources into wiki Markdown via MarkItDown         |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eArchitecture\u003c/h2\u003e\u003c/summary\u003e\n\n```\n┌──────────────────────────────────────────────┐\n│  MCP Client (Claude Desktop, Cursor, etc.)   │\n└──────────┬───────────────────────────────────┘\n           │  MCP (HTTP or stdio)\n┌──────────▼───────────────────────────────────┐\n│  PULSE8.ai Cortex  :8420                     │\n│  ┌──────────────────────────────────────┐     │\n│  │ Auth (API Key or Microsoft Entra ID) │     │\n│  └──────────────┬───────────────────────┘     │\n│  ┌─────────┐ ┌──┴───────┐ ┌──────────────┐   │\n│  │ MCP     │ │ REST API │ │ Vault Watcher│   │\n│  │ /mcp/   │ │ /api/v1/ │ │ (watchfiles) │   │\n│  └────┬────┘ └────┬─────┘ └──────┬───────┘   │\n│       └───────────┼──────────────┘           │\n│            ┌──────▼──────┐                   │\n│            │ Graph Engine│                   │\n│            │ + Compiler  │                   │\n│            └─────────────┘                   │\n└──────────┬───────────────────────────────────┘\n           │\n┌──────────▼───────────────────────────────────┐\n│  QMD  :3100                                  │\n│  BM25 + vector search, auto-indexes on start │\n└──────────┬───────────────────────────────────┘\n           │\n┌──────────▼───────────────────────────────────┐\n│  Vault (bind-mounted volume)                 │\n│  wiki/ raw/ agents/ sessions/ daily/         │\n│  .cortex/ (graph.json, index.md, log.md)     │\n└──────────────────────────────────────────────┘\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eBulk ingest\u003c/h2\u003e\u003c/summary\u003e\n\nFor ingesting many files at once (dozens or hundreds of PDFs, papers, docs), use the one-click shell script instead of feeding them one at a time through MCP. It reads directly from a local directory — no wire overhead, no running server required — deduplicates via SHA-256 hashing, compiles with bounded concurrency, and rebuilds the index once at the end.\n\n### One-click script (recommended)\n\n```bash\n# Ingest all files from a directory\n./scripts/bulk_ingest.sh ./my-papers/\n\n# Dry-run to preview what would be ingested\n./scripts/bulk_ingest.sh ./my-papers/ --dry-run\n\n# Force re-ingest (bypass dedup manifest)\n./scripts/bulk_ingest.sh ./my-papers/ --force\n\n# Control LLM concurrency (default: 4)\n./scripts/bulk_ingest.sh ./my-papers/ --concurrency 8\n```\n\nThe script automatically loads your `.env` for the LLM key and vault path, prints a summary, then runs the full pipeline (copy, compile, reindex). No running Cortex server needed.\n\n### Python CLI (direct)\n\n```bash\nCORTEX_VAULT_PATH=./example_vault uv run cortex-bulk-ingest --source ./my-papers/\n```\n\n### Inside Docker\n\n```bash\n# Set INGEST_DIR in .env or export it, then restart\nexport INGEST_DIR=/path/to/your/papers\ndocker compose up -d\n\n# Run bulk ingest inside the container\ndocker exec pulse8-ai-cortex uv run cortex-bulk-ingest --source /ingest\n```\n\n### Via REST API\n\nFor programmatic use without MCP (requires running Cortex server):\n\n```bash\ncurl -X POST http://localhost:8420/api/v1/bulk-ingest \\\n  -H \"Content-Type: application/json\" \\\n  -H \"x-api-key: your-secret-api-key\" \\\n  -d '{\"source_dir\": \"/ingest\", \"concurrency\": 4}'\n```\n\n### Deduplication\n\nThe dedup manifest is stored at `.cortex/ingest-manifest.json`. Files are matched by content hash, not filename — renaming a file won't cause re-ingestion, and the same content under a different name will be skipped.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eConfiguration\u003c/h2\u003e\u003c/summary\u003e\n\nCopy the example and fill in your values:\n\n```bash\ncp .env.example .env\n```\n\n| Variable                       | Required | Default                        | Description                                          |\n| ------------------------------ | -------- | ------------------------------ | ---------------------------------------------------- |\n| `LLM_API_KEY`                  | No       | —                              | OpenRouter (or compatible) API key (for cross-referencing only) |\n| `COMPILER_MODEL`               | No       | `anthropic/claude-sonnet-4`    | Model for cross-reference detection                             |\n| `LLM_BASE_URL`                 | No       | `https://openrouter.ai/api/v1` | LLM API base URL                                                |\n| `VAULT_DIR`                    | No       | `./example_vault`              | Path to your vault directory                         |\n| `INGEST_DIR`                   | No       | `./ingest`                     | Path to bulk-ingest source directory (mounted as `/ingest` in Docker) |\n| `QMD_REFRESH_INTERVAL_SECONDS` | No       | `900`                          | Periodic re-index interval (seconds; `0` to disable) |\n| `QMD_EMBED_TIMEOUT_MS`         | No       | `600000`                       | Embed timeout in ms (increase for CPU-only deployments) |\n| `QMD_URL`                      | No       | —                              | External QMD URL for cortex-only mode (e.g. `http://host.docker.internal:3100`) |\n| `AUTH_METHOD`                  | No       | `none`                         | Authentication method: `none`, `apikey`, or `oidc` (see [Authentication](#authentication)) |\n| `API_KEY`                      | No       | —                              | Static API key for `x-api-key` header (used when `AUTH_METHOD=apikey`) |\n| `OIDC_TENANT_ID`               | No       | —                              | Microsoft Entra ID tenant ID (used when `AUTH_METHOD=oidc`) |\n| `OIDC_CLIENT_ID`               | No       | —                              | Microsoft Entra ID app (client) ID |\n| `OIDC_CLIENT_SECRET`           | No       | —                              | Microsoft Entra ID client secret |\n| `OIDC_BASE_URL`                | No       | `http://localhost:8420`        | Public base URL of the Cortex server (used for OAuth callbacks) |\n\n`OPENROUTER_API_KEY` and `CORTEX_LLM_API_KEY` are accepted as aliases for `LLM_API_KEY`.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eAuthentication\u003c/h2\u003e\u003c/summary\u003e\n\nCortex supports two authentication methods that protect both the REST API (`/api/v1/`) and the MCP endpoint (`/mcp/`). Set `AUTH_METHOD` in `.env` to choose:\n\n| `AUTH_METHOD` | Description |\n| ------------- | ----------- |\n| `none`        | Default. All endpoints are open — no authentication required. |\n| `apikey`      | Static API key. Clients pass `x-api-key` header. |\n| `oidc`        | Microsoft Entra ID (Azure AD) with OAuth 2.0 + MFA support. |\n\n### API Key (`AUTH_METHOD=apikey`)\n\nThe simplest option. Set the method and key in `.env`:\n\n```\nAUTH_METHOD=apikey\nAPI_KEY=your-secret-api-key\n```\n\nClients pass it via the `x-api-key` header:\n\n```bash\n# REST API\ncurl http://localhost:8420/api/v1/health \\\n  -H \"x-api-key: your-secret-api-key\"\n\n# MCP (via curl)\ncurl -X POST http://localhost:8420/mcp/ \\\n  -H \"x-api-key: your-secret-api-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"initialize\",\"params\":{...}}'\n```\n\nNo OAuth discovery endpoints are served — no login popups. Requests without a valid key receive a `401`.\n\n### Microsoft Entra ID (`AUTH_METHOD=oidc`)\n\nFor enterprise environments that require interactive login with MFA support:\n\n```\nAUTH_METHOD=oidc\nOIDC_TENANT_ID=your-tenant-id\nOIDC_CLIENT_ID=your-client-id\nOIDC_CLIENT_SECRET=your-client-secret\nOIDC_BASE_URL=http://localhost:8420\n```\n\nThis enables:\n- **REST API**: OAuth 2.0 Authorization Code Flow via `GET /api/v1/login`. After login, pass the access token as `Authorization: Bearer \u003ctoken\u003e`. A valid `x-api-key` header is also accepted as a fallback when `API_KEY` is set.\n- **MCP endpoint**: FastMCP's built-in OIDCProxy handles interactive browser-based login.\n\n### Azure AD app registration\n\nTo use OIDC, register an app in the [Azure Portal](https://portal.azure.com):\n\n1. Go to **Azure Active Directory → App registrations → New registration**\n2. Set the redirect URI to `http://localhost:8420/api/v1/auth/callback` (Web platform)\n3. Under **Certificates \u0026 secrets**, create a client secret\n4. Under **API permissions**, add `openid`, `profile`, and `email` (Microsoft Graph → Delegated)\n5. Copy the Tenant ID, Client ID, and Client Secret into `.env`\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eMCP client setup\u003c/h2\u003e\u003c/summary\u003e\n\n### Claude Desktop\n\nAn example config is included at `[claude_desktop_config.example.json](claude_desktop_config.example.json)`.\n\n**HTTP with API key (recommended)** — PULSE8.ai Cortex runs as a persistent server:\n\n```json\n{\n  \"mcpServers\": {\n    \"cortex\": {\n      \"url\": \"http://localhost:8420/mcp/\",\n      \"headers\": {\n        \"x-api-key\": \"your-secret-api-key\"\n      }\n    }\n  }\n}\n```\n\n**HTTP without auth** — when no authentication is configured:\n\n```json\n{\n  \"mcpServers\": {\n    \"cortex\": {\n      \"url\": \"http://localhost:8420/mcp/\"\n    }\n  }\n}\n```\n\n**Stdio** — Claude Desktop launches the server on demand (no auth needed):\n\n```json\n{\n  \"mcpServers\": {\n    \"cortex\": {\n      \"command\": \"uv\",\n      \"args\": [\"run\", \"--project\", \"/path/to/cortex\", \"python\", \"-m\", \"cortex.mcp\"],\n      \"env\": {\n        \"CORTEX_VAULT_PATH\": \"/path/to/your/vault\"\n      }\n    }\n  }\n}\n```\n\n### Cursor\n\nAdd to your `.cursor/mcp.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"cortex\": {\n      \"url\": \"http://localhost:8420/mcp/\",\n      \"headers\": {\n        \"x-api-key\": \"your-secret-api-key\"\n      }\n    }\n  }\n}\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eHow it works\u003c/h2\u003e\u003c/summary\u003e\n\n**Watcher** and **Compiler** are independent components:\n\n- The **Watcher** maintains the graph. Any `.md` file added, modified, or deleted triggers automatic node/edge updates.\n- The **Compiler** converts raw source files to Markdown using [MarkItDown](https://github.com/microsoft/markitdown) and writes them to `wiki/`. Supported formats include PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, images (EXIF/OCR), and plain text. The LLM is only used for optional cross-reference detection between articles.\n\nThey connect indirectly: the compiler writes to `wiki/`, the watcher picks those up and updates the graph.\n\n### Supported file formats\n\n| Format | Extensions |\n| ------ | ---------- |\n| PDF | `.pdf` |\n| Microsoft Word | `.docx` |\n| Microsoft PowerPoint | `.pptx` |\n| Microsoft Excel | `.xlsx`, `.xls` |\n| HTML | `.html`, `.htm` |\n| Text-based | `.csv`, `.json`, `.xml`, `.txt`, `.md` |\n| Images | `.jpg`, `.png`, etc. (EXIF metadata) |\n\n**Search** uses a two-stage pipeline:\n\n1. **QMD** performs keyword/semantic search on file contents\n2. **PULSE8.ai Cortex** enriches results with graph edges (wikilinks, tags, relationships between matched notes)\n\nQMD answers *\"what's relevant?\"* — the graph answers *\"how are these results connected?\"*\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eDevelopment\u003c/h2\u003e\u003c/summary\u003e\n\n```bash\n# Install dependencies\nuv sync --all-extras\n\n# Run tests\nuv run pytest tests/ -v\n\n# Run shell tests (requires bats-core)\nbats tests/test_start_sh.bats\n\n# Start PULSE8.ai Cortex locally (without Docker)\nCORTEX_MCP_TRANSPORT=http CORTEX_VAULT_PATH=./example_vault uv run python scripts/serve.py\n```\n\n### Utility scripts\n\n| Script                    | Description                                                    |\n| ------------------------- | -------------------------------------------------------------- |\n| `scripts/serve.py`        | Dev server (HTTP or stdio based on `CORTEX_MCP_TRANSPORT`)     |\n| `scripts/compile.py`      | Batch-compile all raw sources                                  |\n| `scripts/reindex.py`      | Full reindex + graph rebuild                                   |\n| `scripts/bulk_ingest.sh`  | One-click bulk ingest from a local directory                   |\n| `scripts/bulk_ingest.py`  | Python CLI for bulk ingest (called by `bulk_ingest.sh`)        |\n| `scripts/lint.py`         | Lint vault structure                                           |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eData persistence\u003c/h2\u003e\u003c/summary\u003e\n\nThe vault directory is bind-mounted from your host into the containers. All data lives on your local disk and survives container restarts.\n\nThe QMD search index is stored in a Docker volume (`qmd-cache`). To force a full re-index:\n\n```bash\ndocker compose down -v\n./scripts/start.sh\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eContributing\u003c/h2\u003e\u003c/summary\u003e\n\nWe welcome contributions! Please open an issue to discuss your idea before submitting a pull request.\n\n```bash\n# Fork and clone the repo\ngit clone https://github.com/\u003cyour-username\u003e/cortex-knowledge-vault.git\ncd cortex-knowledge-vault\n\n# Create a branch\ngit checkout -b feat/my-feature\n\n# Install dev dependencies\nuv sync --all-extras\n\n# Make changes, then run tests\nuv run pytest tests/ -v\n\n# Submit a pull request\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eReporting issues\u003c/h2\u003e\u003c/summary\u003e\n\nUse [GitHub Issues](https://github.com/pulse8-ai/cortex-knowledge-vault/issues) to report bugs or request features.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003ch2\u003eAcknowledgements\u003c/h2\u003e\u003c/summary\u003e\n\nPULSE8.ai Cortex builds on ideas and tools from the open-source community:\n\n- **[LLM Wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f)** by [Andrej Karpathy](https://github.com/karpathy) — the core pattern of an LLM-maintained, persistent knowledge base that compiles and interlinks knowledge incrementally rather than re-discovering it from raw documents on every query. This gist is the direct inspiration for Cortex's architecture.\n- **[QMD](https://github.com/tobi/qmd)** by [Tobi Lütke](https://github.com/tobi) — the on-device search engine powering all full-text and hybrid search in Cortex. QMD combines BM25, vector search, and LLM re-ranking, all running locally.\n- **[MarkItDown](https://github.com/microsoft/markitdown)** by [Microsoft](https://github.com/microsoft) — the file-to-Markdown converter powering the Cortex compiler. Converts PDF, Office documents, HTML, images, and more into structured Markdown for ingestion into the vault.\n\n\u003c/details\u003e\n\n## License\n\nThis project is licensed under the [PULSE8.ai Cortex Open Source License](LICENSE.md) (Apache License 2.0 with additional terms).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsynpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynpulse8-opensource%2Fpulse8-ai-cortex-knowledge-vault/lists"}