{"id":39567598,"url":"https://github.com/richardwhiteii/rlm","last_synced_at":"2026-01-23T12:54:04.227Z","repository":{"id":333076943,"uuid":"1136126088","full_name":"richardwhiteii/rlm","owner":"richardwhiteii","description":"Recursive Language Model patterns for Claude Code — handle massive contexts (10M+ tokens) by treating them as external variables","archived":false,"fork":false,"pushed_at":"2026-01-19T02:53:38.000Z","size":11146,"stargazers_count":18,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-21T19:45:52.819Z","etag":null,"topics":["claude","claude-code","context-management","llm","mcp","recursive-language-model"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/richardwhiteii.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-17T05:32:58.000Z","updated_at":"2026-01-21T08:14:49.000Z","dependencies_parsed_at":"2026-01-20T09:00:29.992Z","dependency_job_id":null,"html_url":"https://github.com/richardwhiteii/rlm","commit_stats":null,"previous_names":["richardwhiteii/rlm"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/richardwhiteii/rlm","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardwhiteii%2Frlm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardwhiteii%2Frlm/tags","releases_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/richardwhiteii%2Frlm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardwhiteii%2Frlm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/richardwhiteii","download_url":"https://codeload.github.com/richardwhiteii/rlm/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/richardwhiteii%2Frlm/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28661882,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T01:17:37.254Z","status":"online","status_checked_at":"2026-01-22T02:00:07.137Z","response_time":144,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["claude","claude-code","context-management","llm","mcp","recursive-language-model"],"created_at":"2026-01-18T07:15:33.116Z","updated_at":"2026-01-22T11:01:44.306Z","avatar_url":"https://github.com/richardwhiteii.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RLM MCP Server\n\nRecursive Language Model patterns for Claude Code — handle massive contexts (10M+ tokens) by treating them as external variables.\n\nBased on: https://arxiv.org/html/2512.24601v1\n\n## How Users Interact\n\nYou don't call RLM tools directly. 
You ask Claude to analyze large files, and Claude uses RLM behind the scenes.\n\n**Example:**\n- You say: \"Analyze this 2MB log file for errors\"\n- Claude uses RLM tools internally\n- You get: \"I found 3 error patterns: database timeouts (47), auth failures (23)...\"\n\n## Core Idea\n\nInstead of feeding massive contexts directly into the LLM:\n1. **Load** context as external variable (stays out of prompt)\n2. **Inspect** structure programmatically\n3. **Chunk** strategically (lines, chars, or paragraphs)\n4. **Sub-query** recursively on chunks\n5. **Aggregate** results for final synthesis\n\n## Quick Start\n\n### Installation\n\n```bash\ngit clone https://github.com/richardwhiteii/rlm.git\ncd rlm\nuv sync\n```\n\nOr with pip:\n```bash\npython -m venv .venv\nsource .venv/bin/activate\npip install -e .\n```\n\n### Configure for Claude Code\n\nFirst, find your installation path:\n```bash\ncd rlm \u0026\u0026 pwd\n# Example output: /Users/your_username/projects/rlm\n```\n\nAdd to `~/.claude/.mcp.json`, replacing the example paths with your own:\n\n```json\n{\n  \"mcpServers\": {\n    \"rlm\": {\n      \"command\": \"uv\",\n      \"args\": [\"run\", \"--directory\", \"/Users/your_username/projects/rlm\", \"python\", \"-m\", \"src.rlm_mcp_server\"],\n      \"env\": {\n        \"RLM_DATA_DIR\": \"/Users/your_username/.rlm-data\"\n      }\n    }\n  }\n}\n```\n\n\u003e **Note**: Replace `/Users/your_username/projects/rlm` with the output from `pwd` above. 
The `RLM_DATA_DIR` is where RLM stores contexts and results — you can use any directory you prefer.\n\n**Alternative** (if not using uv):\n```json\n{\n  \"mcpServers\": {\n    \"rlm\": {\n      \"command\": \"/Users/your_username/projects/rlm/.venv/bin/python\",\n      \"args\": [\"-m\", \"src.rlm_mcp_server\"],\n      \"cwd\": \"/Users/your_username/projects/rlm\",\n      \"env\": {\n        \"RLM_DATA_DIR\": \"/Users/your_username/.rlm-data\"\n      }\n    }\n  }\n}\n```\n\n## Enable Auto-Detection\n\nEnable Claude to use RLM tools automatically without manual invocation:\n\n**1. CLAUDE.md Integration**\nCopy `CLAUDE.md.example` content to your project's `CLAUDE.md` (or `~/.claude/CLAUDE.md` for global) to teach Claude when to reach for RLM tools automatically.\n\n**2. Hook Installation**\nCopy the `.claude/hooks/` directory to your project to auto-suggest RLM when reading files \u003e25KB:\n```bash\ncp -r .claude/hooks/ /Users/your_username/your-project/.claude/hooks/\n```\nThe hook provides guidance but doesn't block reads.\n\n**3. Skill Reference**\nCopy the `.claude/skills/` directory for comprehensive RLM guidance:\n```bash\ncp -r .claude/skills/ /Users/your_username/your-project/.claude/skills/\n```\n\nWith these in place, Claude will autonomously detect when to use RLM instead of reading large files directly into context.\n\n## Tools\n\n\u003e These tools are used by Claude internally when processing large contexts. 
You don't call them directly—you just ask Claude to analyze large files.\n\n| Tool | Purpose |\n|------|---------|\n| `rlm_auto_analyze` | **One-step analysis** — auto-detects type, chunks, and queries |\n| `rlm_load_context` | Load context as external variable |\n| `rlm_inspect_context` | Get structure info without loading into prompt |\n| `rlm_chunk_context` | Chunk by lines/chars/paragraphs |\n| `rlm_get_chunk` | Retrieve specific chunk |\n| `rlm_filter_context` | Filter with regex (keep/remove matching lines) |\n| `rlm_exec` | Execute Python code against loaded context (sandboxed) |\n| `rlm_sub_query` | Make sub-LLM call on chunk |\n| `rlm_sub_query_batch` | Process multiple chunks in parallel |\n| `rlm_store_result` | Store sub-call result for aggregation |\n| `rlm_get_results` | Retrieve stored results |\n| `rlm_list_contexts` | List all loaded contexts |\n\n### Quick Analysis with `rlm_auto_analyze`\n\nFor most use cases, Claude uses `rlm_auto_analyze` — it handles everything automatically:\n\n```python\nrlm_auto_analyze(\n    name=\"my_file\",\n    content=file_content,\n    goal=\"find_bugs\"  # or: summarize, extract_structure, security_audit, answer:\u003cquestion\u003e\n)\n```\n\n**What it does automatically:**\n1. Detects content type (Python, JSON, Markdown, logs, prose, code)\n2. Selects optimal chunking strategy\n3. Adapts the query for the content type\n4. Runs parallel sub-queries\n5. 
Returns aggregated results\n\n**Supported goals:**\n\n| Goal | Description |\n|------|-------------|\n| `summarize` | Summarize content purpose and key points |\n| `find_bugs` | Identify errors, issues, potential problems |\n| `extract_structure` | List functions, classes, schema, headings |\n| `security_audit` | Find vulnerabilities and security issues |\n| `answer:\u003cquestion\u003e` | Answer a custom question about the content |\n\n### Programmatic Analysis with `rlm_exec`\n\nFor deterministic pattern matching and data extraction, Claude can use `rlm_exec` to run Python code directly against a loaded context. This is closer to the paper's REPL approach and provides full control over analysis logic.\n\n**Tool**: `rlm_exec`\n\n**Purpose**: Execute arbitrary Python code against a loaded context in a sandboxed subprocess.\n\n**Parameters**:\n- `code` (required): Python code to execute. Set the `result` variable to capture output.\n- `context_name` (required): Name of a previously loaded context.\n- `timeout` (optional, default 30): Maximum execution time in seconds.\n\n**Features**:\n- Context available as read-only `context` variable\n- Pre-imported modules: `re`, `json`, `collections`\n- Subprocess isolation (won't crash the server)\n- Timeout enforcement\n- Works on any system with Python (no Docker needed)\n\n**Example — Finding patterns in a loaded context**:\n\n```python\n# After loading a context\nrlm_exec(\n    code=\"\"\"\nimport re\namounts = re.findall(r'\\$[\\d,]+', context)\nresult = {'count': len(amounts), 'sample': amounts[:5]}\n\"\"\",\n    context_name=\"bill\"\n)\n```\n\n**Example Response**:\n\n```json\n{\n  \"result\": {\n    \"count\": 1247,\n    \"sample\": [\"$500\", \"$1,000\", \"$250,000\", \"$100,000\", \"$50\"]\n  },\n  \"stdout\": \"\",\n  \"stderr\": \"\",\n  \"return_code\": 0,\n  \"timed_out\": false\n}\n```\n\n**Example — Extracting structured data**:\n\n```python\nrlm_exec(\n    code=\"\"\"\nimport re\nimport json\n\n# Find all 
email addresses\nemails = re.findall(r'\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b', context)\n\n# Count by domain\nfrom collections import Counter\ndomains = [e.split('@')[1] for e in emails]\ndomain_counts = Counter(domains)\n\nresult = {\n    'total_emails': len(emails),\n    'unique_domains': len(domain_counts),\n    'top_domains': domain_counts.most_common(5)\n}\n\"\"\",\n    context_name=\"dataset\",\n    timeout=60\n)\n```\n\n**When to use `rlm_exec` vs `rlm_sub_query`**:\n\n| Use Case | Tool | Why |\n|----------|------|-----|\n| Extract all dates, IDs, amounts | `rlm_exec` | Regex is deterministic and fast |\n| Find security vulnerabilities | `rlm_sub_query` | Requires reasoning and context |\n| Parse JSON/XML structure | `rlm_exec` | Standard libraries work perfectly |\n| Summarize themes or tone | `rlm_sub_query` | Natural language understanding needed |\n| Count word frequencies | `rlm_exec` | Simple computation, no AI needed |\n| Answer \"Why did X happen?\" | `rlm_sub_query` | Requires inference and reasoning |\n\n**Tip**: For large contexts, combine both — use `rlm_exec` to filter/extract, then `rlm_sub_query` for semantic analysis of filtered results.\n\n## Providers\n\nBy default, sub-queries use **Claude Haiku 4.5** via the Claude Agent SDK. 
This works out-of-the-box if you have a Claude API key configured.\n\n| Provider | Default Model | Cost | Use Case |\n|----------|--------------|------|----------|\n| `claude-sdk` | claude-haiku-4-5 | ~$0.80/1M input | Default, works everywhere |\n| `ollama` | olmo-3.1:32b | $0 | Local inference, requires Ollama |\n\n## Recursive Sub-Queries\n\nThe `rlm_sub_query` and `rlm_sub_query_batch` tools support hierarchical decomposition via the `max_depth` parameter:\n\n- `max_depth=0` (default): Flat call, no recursion\n- `max_depth=1-5`: Sub-LLM can use RLM tools (chunk, filter, sub_query, etc.)\n\nExample: Analyzing a massive codebase with 2-level recursion:\n```python\nrlm_sub_query(\n    query=\"Find all security vulnerabilities\",\n    context_name=\"codebase\",\n    chunk_index=0,\n    max_depth=2  # Allow sub-queries to further decompose\n)\n```\n\n**How it works:**\n1. When `max_depth \u003e 0`, the sub-LLM receives RLM tools in its function calling context\n2. If the sub-LLM decides to use a tool (e.g., `rlm_chunk_context`), the agent loop handles it\n3. Each recursive call decrements the depth limit until `max_depth` is reached\n4. The response includes recursion metadata: `depth_reached` and `call_trace`\n\n**Recommended for recursive calls**: Use a local model like `gemma3:27b` via Ollama to avoid cost escalation from deep recursion.\n\n## Using Ollama (Free Local Inference)\n\nWith [Ollama](https://ollama.ai) installed locally, you can run sub-queries at zero cost:\n\n1. **Install Ollama** and pull a model:\n   ```bash\n   ollama pull gemma3:27b\n   ```\n\n2. 
**Add Ollama URL** to your MCP config:\n   ```json\n   {\n     \"mcpServers\": {\n       \"rlm\": {\n         \"command\": \"uv\",\n         \"args\": [\"run\", \"--directory\", \"/Users/your_username/projects/rlm\", \"python\", \"-m\", \"src.rlm_mcp_server\"],\n         \"env\": {\n           \"RLM_DATA_DIR\": \"/Users/your_username/.rlm-data\",\n           \"OLLAMA_URL\": \"http://localhost:11434\"\n         }\n       }\n     }\n   }\n   ```\n\n3. **Specify provider** in sub-queries:\n   ```\n   rlm_sub_query(\n       query=\"Summarize this section\",\n       context_name=\"my_doc\",\n       chunk_index=0,\n       provider=\"ollama\"  # Use local Ollama instead of default claude-sdk\n   )\n   ```\n\nOr for batch processing:\n```\nrlm_sub_query_batch(\n    query=\"Extract key points\",\n    context_name=\"my_doc\",\n    chunk_indices=[0, 1, 2, 3],\n    provider=\"ollama\",  # Use local Ollama instead of default claude-sdk\n    concurrency=4\n)\n```\n\n## Usage Examples\n\n### Basic Pattern\n\n```\n# 1. Load a large document\nrlm_load_context(name=\"report\", content=\u003clarge document\u003e)\n\n# 2. Inspect structure\nrlm_inspect_context(name=\"report\", preview_chars=500)\n\n# 3. Chunk into manageable pieces\nrlm_chunk_context(name=\"report\", strategy=\"paragraphs\", size=1)\n\n# 4. Sub-query chunks in parallel\nrlm_sub_query_batch(\n    query=\"What is the main topic? Reply in one sentence.\",\n    context_name=\"report\",\n    chunk_indices=[0, 1, 2, 3],\n    concurrency=4  # uses claude-sdk by default\n)\n\n# 5. Store results for aggregation\nrlm_store_result(name=\"topics\", result=\u003cresponse\u003e)\n\n# 6. 
Retrieve all results\nrlm_get_results(name=\"topics\")\n```\n\n### Analyzing Encyclopedia Britannica (11MB)\n\nThe flagship example of RLM capabilities — processing the Encyclopedia Britannica, 11th Edition from Project Gutenberg:\n\n```python\n# Load the full encyclopedia (11MB, ~2M tokens)\ncontent = open(\"docs/encyclopedia/merged_encyclopedia.txt\").read()\nrlm_load_context(name=\"encyclopedia\", content=content)\n\n# Inspect\nrlm_inspect_context(name=\"encyclopedia\")\n# → 11MB, 184K lines, ~2M tokens\n\n# Chunk for processing\nrlm_chunk_context(name=\"encyclopedia\", strategy=\"paragraphs\", size=30)\n\n# Query across the corpus\nrlm_sub_query_batch(\n    query=\"Summarize the main topics in this section\",\n    context_name=\"encyclopedia\",\n    chunk_indices=[0, 50, 100, 150],\n    provider=\"claude-sdk\"  # or \"ollama\" for free local inference\n)\n```\n\n**Extract Topic Catalog**:\n\n```python\n# Filter for specific subject\nrlm_filter_context(\n    name=\"encyclopedia\",\n    output_name=\"botany\",\n    pattern=\"(?i)(botan|plant|flora|flower|genus)\",\n    mode=\"keep\"\n)\n\n# Analyze filtered content\nrlm_auto_analyze(\n    name=\"botany_analysis\",\n    content=filtered_content,\n    goal=\"answer:List all botanical articles with brief descriptions\"\n)\n```\n\n| Metric | Value |\n|--------|-------|\n| File size | 11 MB |\n| Lines | 184,000 |\n| Tokens | ~2M |\n| Processing cost | $0 (Ollama) or ~$1.60 (Haiku) |\n\n## Data Storage\n\n```\n$RLM_DATA_DIR/\n├── contexts/     # Raw contexts (.txt + .meta.json)\n├── chunks/       # Chunked versions (by context name)\n└── results/      # Stored sub-call results (.jsonl)\n```\n\nContexts persist across sessions. 
Chunked contexts are cached for reuse.\n\n## Architecture\n\n```\nClaude Code\n    │\n    ▼\nRLM MCP Server\n    │\n    ├─► claude-sdk (Haiku 4.5) ─► Anthropic API\n    │\n    └─► ollama ─► Local LLM (gemma3:27b, llama3, etc.)\n```\n\nThe key insight: **context stays external, not in your prompt**. Claude orchestrates; sub-models analyze.\n\n## For Contributors: Learning the Codebase\n\nUse these prompts with Claude Code to explore the codebase and learn RLM patterns. The code is the single source of truth.\n\n### Understanding the Tools\n\n```\nRead src/rlm_mcp_server.py and list all RLM tools with their parameters and purpose.\n```\n\n```\nExplain the chunking strategies available in rlm_chunk_context.\nWhen would I use each one?\n```\n\n```\nWhat's the difference between rlm_sub_query and rlm_sub_query_batch?\nShow me the implementation.\n```\n\n### Understanding the Architecture\n\n```\nRead src/rlm_mcp_server.py and explain how contexts are stored and persisted.\nWhere does the data live?\n```\n\n```\nHow does the claude-sdk provider extract text from responses?\nWalk me through _call_claude_sdk.\n```\n\n```\nWhat happens when I call rlm_load_context? Trace the full flow.\n```\n\n### Hands-On Learning\n\n```\nLoad the README as a context, chunk it by paragraphs,\nand run a sub-query on the first chunk to summarize it.\n```\n\n```\nShow me how to process a large file in parallel using rlm_sub_query_batch.\nUse a real example.\n```\n\n```\nI have a 1MB log file. 
Walk me through the RLM pattern to extract all errors.\n```\n\n### Extending RLM\n\n```\nRead the test file and explain what scenarios are covered.\nWhat edge cases should I be aware of?\n```\n\n```\nHow would I add a new chunking strategy (e.g., by regex delimiter)?\nShow me where to modify the code.\n```\n\n```\nHow would I add a new provider (e.g., OpenAI)?\nWhat functions need to change?\n```\n\n## Test Corpus: Encyclopedia Britannica\n\nThe repository includes excerpts from the **Encyclopedia Britannica, 11th Edition** (1910-1911) from Project Gutenberg for testing RLM capabilities on large reference documents.\n\n### Included Files\n\n| File | Size | Description |\n|------|------|-------------|\n| `docs/encyclopedia/merged_encyclopedia.txt` | 11MB | All slices merged (~2M tokens) |\n\n### Using for Testing\n\n```python\n# Load the full encyclopedia\ncontent = open(\"docs/encyclopedia/merged_encyclopedia.txt\").read()\nrlm_load_context(name=\"encyclopedia\", content=content)\n\n# Inspect\nrlm_inspect_context(name=\"encyclopedia\")\n# → 11MB, 184K lines, ~2M tokens\n\n# Chunk for processing\nrlm_chunk_context(name=\"encyclopedia\", strategy=\"paragraphs\", size=30)\n\n# Query across the corpus\nrlm_sub_query_batch(\n    query=\"Summarize the main topics in this section\",\n    context_name=\"encyclopedia\",\n    chunk_indices=[0, 50, 100, 150],\n    provider=\"claude-sdk\"  # or \"ollama\" for free local inference\n)\n```\n\n### Example: Extract Topic Catalog\n\n```python\n# Filter for specific subject\nrlm_filter_context(\n    name=\"encyclopedia\",\n    output_name=\"botany\",\n    pattern=\"(?i)(botan|plant|flora|flower|genus)\",\n    mode=\"keep\"\n)\n\n# Analyze filtered content\nrlm_auto_analyze(\n    name=\"botany_analysis\",\n    content=filtered_content,\n    goal=\"answer:List all botanical articles with brief descriptions\"\n)\n```\n\n### Download More Slices\n\nAdditional encyclopedia volumes available from Project Gutenberg:\n- Search: 
https://www.gutenberg.org/ebooks/search/?query=encyclopaedia+britannica+11th\n- ~130 slices available covering A-Z\n\n### Attribution\n\nEncyclopedia Britannica, 11th Edition (1910-1911) sourced from [Project Gutenberg](https://www.gutenberg.org/). Public domain.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frichardwhiteii%2Frlm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frichardwhiteii%2Frlm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frichardwhiteii%2Frlm/lists"}