https://github.com/jhsu/ai-rlm
A TypeScript implementation of the RLM (Recursive Language Model) inference strategy using the AI SDK
- Host: GitHub
- URL: https://github.com/jhsu/ai-rlm
- Owner: jhsu
- License: mit
- Created: 2026-02-14T00:08:57.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-14T13:19:28.000Z (3 months ago)
- Last Synced: 2026-03-14T17:21:29.224Z (3 months ago)
- Topics: ai, ai-sdk, rlm
- Language: TypeScript
- Homepage: https://www.npmjs.com/package/ai-rlm
- Size: 133 KB
- Stars: 4
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# ai-rlm
RLM (Recursive Language Model) inference for the AI SDK, available as an Agent or as a tool.
Based on the paper "Recursive Language Models" by Zhang, Kraska, and Khattab (2025).
## Overview
RLM is an inference strategy in which the LLM treats a long context as part of an external environment instead of receiving it directly in its prompt. The LLM writes JavaScript code to examine and decompose the context programmatically, and to call sub-LLMs recursively over snippets of it.
### Key Features
- **Iterative Code Execution**: The model writes JavaScript code, sees output, then writes more code
- **Sub-LLM Queries**: Access to `llm_query()` and `llm_query_batched()` for semantic analysis
- **Context Management**: Efficient handling of large contexts through chunking
- **Sandboxed REPL**: JavaScript execution in a sandboxed QuickJS WebAssembly context
- **Pluggable Sandbox Interface**: Swap the execution environment with your own sandbox implementation
- **AI SDK Integration**: Works as an Agent or Tool with the Vercel AI SDK
- **Multiple Usage Patterns**: Use as standalone agent or as a tool in larger workflows
## Installation
```bash
npm install ai-rlm ai zod @ai-sdk/openai
```
`ai` and `zod` are peer dependencies and must be installed in your project.
The `model` and `subModel` settings accept any AI SDK `LanguageModel` — use any provider ([OpenAI](https://sdk.vercel.ai/providers/ai-sdk-providers/openai), [Anthropic](https://sdk.vercel.ai/providers/ai-sdk-providers/anthropic), [Google](https://sdk.vercel.ai/providers/ai-sdk-providers/google-generative-ai), etc.).
## Usage
### As Agent (Recommended)
The **RLMAgent** class provides a clean, agent-based API that integrates seamlessly with the AI SDK:
```typescript
import { RLMAgent } from 'ai-rlm';
import { openai } from '@ai-sdk/openai';

// Create agent
const agent = new RLMAgent({
  model: openai('gpt-4.1'), // Root agent model
  subModel: openai('gpt-4.1-mini'), // Sub-LLM model for queries
  maxIterations: 20, // Max REPL iterations
  maxLLMCalls: 50, // Max sub-LLM calls
});

// Process a context
const context = `
The quick brown fox jumps over the lazy dog.
The magic number is 42.
`;
const query = 'What is the magic number?';

const result = await agent.generate({
  prompt: query,
  options: { context },
});

const rlmResult = result.output;
console.log('Answer:', result.text);
console.log('Iterations:', rlmResult.iterations);
console.log('LLM Calls:', rlmResult.llmCallCount);
console.log('Steps:', rlmResult.steps); // Full trajectory
```
### As Tool
Use **createRLMTool** to create an AI SDK-compatible tool for use with `generateText` or `ToolLoopAgent`:
```typescript
import { createRLMTool } from 'ai-rlm';
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

// Create the tool
const rlmTool = createRLMTool({
  model: openai('gpt-4.1'),
  subModel: openai('gpt-4.1-mini'),
});

// Use in generateText
const result = await generateText({
  model: openai('gpt-4.1'),
  tools: { analyzeLargeContext: rlmTool },
  prompt: 'Analyze this large codebase for security vulnerabilities',
});
```
### With ToolLoopAgent
```typescript
import { ToolLoopAgent } from 'ai';
import { createRLMTool } from 'ai-rlm';
import { openai } from '@ai-sdk/openai';

const agent = new ToolLoopAgent({
  model: openai('gpt-4.1'),
  tools: {
    analyzeLargeContext: createRLMTool({
      model: openai('gpt-4.1'),
      subModel: openai('gpt-4.1-mini'),
    }),
    // ... other tools
  },
});

const result = await agent.generate({
  prompt: 'Check this document for compliance issues',
});
```
### Streaming Support
```typescript
const stream = await agent.stream({
  prompt: 'Analyze this',
  options: { context: largeDocument },
});

// textStream emits the final text after generate() completes
const reader = stream.textStream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(value);
}
```
## How It Works
The RLM agent writes JavaScript code to explore the context in an iterative loop:
```javascript
// First, explore the context
console.log('Context length:', context.length);
console.log('First 200 chars:', context.substring(0, 200));
// Search for specific patterns
const lines = context.split('\n');
const targetLine = lines.find(line => line.includes('magic number'));
console.log('Found:', targetLine);
// Store result for later
const answer = targetLine?.match(/magic number is (\d+)/)?.[1];
// Submit answer
FINAL_VAR(answer)
```
1. **Context Loading**: The context is loaded into a sandboxed JavaScript REPL environment
2. **Iterative Reasoning**: The root LLM writes JavaScript code to explore the context
3. **Code Execution**: Code is executed in a QuickJS WebAssembly sandbox with a 30s timeout
4. **Sub-LLM Queries**: For semantic analysis, `llm_query()` delegates to a sub-model
5. **Result Accumulation**: The model iterates, seeing the output of each step, until it finds an answer or reaches the `maxIterations` limit
6. **Final Answer**: The model submits an answer using `FINAL(answer)` or `FINAL_VAR(variable_name)`
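The steps above can be pictured as a small loop. The sketch below is illustrative only and is not the library's implementation: `runLoop`, `execute`, and `scriptedModel` are hypothetical names, the root LLM is replaced by a scripted function, and snippets run through `new Function`, which provides no sandboxing at all (the library uses QuickJS for that).

```typescript
// Hypothetical sketch of the write-code / run / observe loop.
type Step = { iteration: number; code: string; output: string };
type LoopResult = { answer: string | undefined; steps: Step[] };
type Model = (history: Step[]) => string; // stand-in for the root LLM

// Runs one snippet with `context`, a capturing `console`, and `FINAL` in scope.
// NOTE: new Function is NOT a sandbox; it only keeps this sketch short.
function execute(code: string, context: string): { output: string; finalAnswer?: string } {
  const lines: string[] = [];
  const state: { finalAnswer?: string } = {};
  const fakeConsole = {
    log: (...args: unknown[]) => {
      lines.push(args.map(String).join(" "));
    },
  };
  const FINAL = (answer: unknown) => {
    state.finalAnswer = String(answer);
  };
  try {
    new Function("context", "console", "FINAL", code)(context, fakeConsole, FINAL);
  } catch (err) {
    lines.push(`Error: ${(err as Error).message}`);
  }
  return { output: lines.join("\n"), finalAnswer: state.finalAnswer };
}

function runLoop(model: Model, context: string, maxIterations = 20): LoopResult {
  const steps: Step[] = [];
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const code = model(steps); // the model sees earlier steps and outputs
    const { output, finalAnswer } = execute(code, context);
    steps.push({ iteration, code, output });
    if (finalAnswer !== undefined) return { answer: finalAnswer, steps };
  }
  return { answer: undefined, steps }; // iteration budget exhausted
}

// Scripted "model": explore first, then extract and submit.
const scriptedModel: Model = (history) =>
  history.length === 0
    ? `console.log(context.length);`
    : `const m = context.match(/magic number is (\\d+)/); FINAL(m ? m[1] : "not found");`;

const loopResult = runLoop(scriptedModel, "The magic number is 42.");
console.log(loopResult.answer, loopResult.steps.length);
```

The first scripted step only inspects the context and the second submits the answer, mirroring the explore-then-answer pattern described above.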
### System Prompt
The RLM system prompt instructs the model to:
- EXPLORE FIRST - Look at data before processing
- ITERATE - Write small code snippets, observe outputs
- VERIFY BEFORE SUBMITTING - Check results are correct
- USE llm_query FOR SEMANTICS - Code finds WHERE; LLM understands WHAT
- CHUNK SMARTLY - Feed substantial chunks to sub-LLMs (~500K chars)
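To make the chunking guidance concrete, here is a minimal sketch of the kind of helper a model might write inside the REPL. `chunkText` and `buildChunkPrompts` are hypothetical names, not part of the library, and the fixed-size split is an assumption; a real run might split on line or document boundaries instead.

```typescript
// Split a long string into fixed-size chunks (the last chunk may be shorter).
function chunkText(text: string, chunkSize: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// Build one sub-LLM prompt per chunk, repeating the question each time.
function buildChunkPrompts(question: string, chunks: string[]): string[] {
  return chunks.map(
    (chunk, i) => `Chunk ${i + 1} of ${chunks.length}.\nQuestion: ${question}\n\n${chunk}`,
  );
}

const demoChunks = chunkText("a".repeat(1200), 500);
const demoPrompts = buildChunkPrompts("Where is the magic number?", demoChunks);
console.log(demoChunks.map((c) => c.length)); // chunk lengths: 500, 500, 200
console.log(demoPrompts.length);
```

Inside the sandbox, an array like `demoPrompts` is the sort of value that would be handed to `llm_query_batched(prompts)`.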
## REPL Sandbox
The JavaScript REPL runs code in a QuickJS WebAssembly sandboxed context:
### Available in the Sandbox:
- **`context`**: The input context (string or object)
- **`console.log()` / `console.error()`**: Output logging
- **`llm_query(prompt)`**: Query a sub-LLM for semantic analysis
- **`llm_query_batched(prompts)`**: Run several sub-LLM queries as one batch
- **`FINAL(answer)`**: Submit final answer directly
- **`FINAL_VAR(varName)`**: Submit a variable from the REPL
- **Standard JavaScript**: All ES6+ features, Array methods, String methods, Math, JSON, etc.
### Security Features:
- 30-second timeout on code execution
- No access to Node.js built-in modules or file system
- No network access
- Sandboxed console output capture
### Custom Sandbox Implementations
`RLMAgent` supports user-defined sandboxes through `sandboxFactory`.
```typescript
import {
  RLMAgent,
  createQuickJSSandbox,
  type RLMSandbox,
  type RLMSandboxFactoryOptions,
} from 'ai-rlm';
import { openai } from '@ai-sdk/openai';

const sandboxFactory = (options: RLMSandboxFactoryOptions): RLMSandbox => {
  // Wrap the default QuickJS sandbox, or return your own implementation.
  return createQuickJSSandbox(options);
};

const agent = new RLMAgent({
  model: openai('gpt-4.1'),
  subModel: openai('gpt-4.1-mini'),
  sandboxFactory,
});
```
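One use for wrapping the default sandbox is to record what the model executes. The sketch below shows the wrapping pattern only: `SandboxLike`, `withExecutionLog`, and `fakeSandbox` are hypothetical, self-contained stand-ins that cover just two methods of the `RLMSandbox` interface documented in this section. A real wrapper would forward every method of that interface in the same way.

```typescript
// Local stand-ins so the sketch runs without the library installed.
type ExecutionResult = { stdout: string; stderr: string; error?: string; result?: unknown };
type LogEntry = { code: string; ms: number; failed: boolean };

interface SandboxLike {
  executeJavaScript(code: string): Promise<ExecutionResult>;
  cleanup(): void;
}

// Returns a sandbox that delegates to `inner` and records each execution.
function withExecutionLog(inner: SandboxLike, log: LogEntry[]): SandboxLike {
  return {
    async executeJavaScript(code: string): Promise<ExecutionResult> {
      const started = Date.now();
      const result = await inner.executeJavaScript(code);
      log.push({ code, ms: Date.now() - started, failed: result.error !== undefined });
      return result;
    },
    cleanup: () => inner.cleanup(),
  };
}

// Fake inner sandbox: reports an error for any snippet containing "throw".
const fakeSandbox: SandboxLike = {
  async executeJavaScript(code: string): Promise<ExecutionResult> {
    return code.includes("throw")
      ? { stdout: "", stderr: "", error: "boom" }
      : { stdout: "ok", stderr: "" };
  },
  cleanup() {},
};

async function demoWrapper(): Promise<LogEntry[]> {
  const log: LogEntry[] = [];
  const sandbox = withExecutionLog(fakeSandbox, log);
  await sandbox.executeJavaScript("console.log('hi')");
  await sandbox.executeJavaScript("throw new Error('boom')");
  sandbox.cleanup();
  return log;
}

demoWrapper().then((entries) => console.log(entries.map((e) => e.failed)));
```

In a `sandboxFactory`, the same idea would amount to returning a wrapper around `createQuickJSSandbox(options)` rather than around a fake.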
Your sandbox must implement:
```typescript
interface RLMSandbox {
  loadContext(context: RLMContext): Promise<void>;
  executeJavaScript(code: string): Promise<{
    stdout: string;
    stderr: string;
    error?: string;
    result?: unknown;
  }>;
  getVariable(name: string): unknown;
  getLLMCallCount(): number;
  getUsageSummary(): RLMUsageSummary;
  cleanup(): void;
}
```
Custom sandbox factories are also propagated to recursive `sub_rlm()` calls.
### Logging
Library diagnostics are silent by default. If you want internal agent logs, pass an explicit logger and log level:
```typescript
const agent = new RLMAgent({
  model: openai('gpt-4.1'),
  subModel: openai('gpt-4.1-mini'),
  logger: console,
  logLevel: 'debug',
});
```
Use this for local debugging. In application code, prefer wiring `logger` to your app's logging system rather than relying on `console`.
## API Reference
### RLMAgent
The primary class for using RLM as an agent.
#### `constructor(settings: RLMAgentSettings)`
```typescript
import type { LanguageModel } from 'ai';

interface RLMAgentSettings {
  model: LanguageModel; // Required: Root agent model
  subModel?: LanguageModel; // Optional: Sub-LLM model (defaults to model)
  maxIterations?: number; // Max REPL iterations (default: 20)
  maxLLMCalls?: number; // Max sub-LLM calls (default: 50)
  maxOutputChars?: number; // Max REPL output chars (default: 100000)
  maxHistoryPreview?: number; // Max output preview chars in model history (default: 500)
  prepareIteration?: (ctx) => PrepareIterationResult | void | Promise<PrepareIterationResult | void>;
  prepareSubAgent?: (ctx) => PrepareSubAgentResult | void | Promise<PrepareSubAgentResult | void>;
  logger?: RLMLogger; // Optional injected logger
  logLevel?: RLMLogLevel; // Log level for internal diagnostics (default: "silent")
  sandboxFactory?: RLMSandboxFactory; // Optional custom sandbox factory
}
```
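A limit such as `maxLLMCalls` can be thought of as a call budget. The following is a minimal sketch of that idea, not the library's code: `withCallBudget` is a hypothetical helper, and throwing once the budget is spent is an assumption, since this README does not say what the agent does when a limit is reached.

```typescript
// Wraps any function and refuses further calls once `maxCalls` is reached.
function withCallBudget<A extends unknown[], R>(
  fn: (...args: A) => R,
  maxCalls: number,
): { call: (...args: A) => R; used: () => number } {
  let used = 0;
  return {
    call: (...args: A): R => {
      if (used >= maxCalls) {
        throw new Error(`Sub-LLM call budget of ${maxCalls} exhausted`);
      }
      used += 1;
      return fn(...args);
    },
    used: () => used,
  };
}

// Stand-in for a sub-LLM: echoes the prompt back.
const budgeted = withCallBudget((prompt: string) => `echo: ${prompt}`, 2);
budgeted.call("first");
budgeted.call("second");

let budgetError = "";
try {
  budgeted.call("third"); // exceeds the budget of 2
} catch (err) {
  budgetError = (err as Error).message;
}
console.log(budgeted.used(), budgetError);
```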
#### `async generate(options): Promise<RLMGenerateResult>`
Generate an answer by iteratively analyzing the context.
**Parameters:**
```typescript
interface RLMAgentCallParameters {
  context: RLMContext; // The large context to analyze
  query: string; // The question or task
  abortSignal?: AbortSignal; // Optional abort signal
  timeout?: number; // Optional timeout in ms
  onStepFinish?: (step: REPLStep) => void; // Callback for each step
}
```
**Returns:**
```typescript
interface RLMGenerateResult {
  text: string; // The generated answer
  steps: REPLStep[]; // Array of REPL steps taken
  llmCallCount: number; // Total LLM calls made
  iterations: number; // Total iterations performed
  usage: RLMUsageSummary; // Aggregated token usage across root + sub-calls
}

interface REPLStep {
  iteration: number;
  reasoning: string; // The model's reasoning before code
  code: string; // JavaScript code executed
  output: string; // Console output and results
}
```
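Because `steps` holds the full trajectory, it is easy to turn into a compact trace for debugging. The helper below is a hypothetical sketch: `formatTrajectory` and `REPLStepLike` are not library exports, and `REPLStepLike` is a local copy of the `REPLStep` shape documented above so the example runs on its own.

```typescript
// Local copy of the documented REPLStep shape.
interface REPLStepLike {
  iteration: number;
  reasoning: string;
  code: string;
  output: string;
}

// One line per step: iteration number, code size, and a truncated output preview.
function formatTrajectory(steps: REPLStepLike[], maxOutput = 80): string {
  return steps
    .map((step) => {
      const output =
        step.output.length > maxOutput ? `${step.output.slice(0, maxOutput)}...` : step.output;
      const codeLines = step.code.split("\n").length;
      const unit = codeLines === 1 ? "line" : "lines";
      return `#${step.iteration} (${codeLines} ${unit} of code): ${output}`;
    })
    .join("\n");
}

const demoSteps: REPLStepLike[] = [
  { iteration: 1, reasoning: "Look at the data", code: "console.log(context.length)", output: "23" },
  { iteration: 2, reasoning: "Extract", code: "const m = context.match(/\\d+/);\nFINAL(m[0])", output: "42" },
];
const trace = formatTrajectory(demoSteps);
console.log(trace);
```

The same formatter could be applied to a single step inside the documented `onStepFinish` callback.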
#### `async stream(options): Promise<RLMStreamResult>`
Run `generate()` and emit AI SDK-style stream parts for iteration progress and final text output.
**Returns:**
```typescript
interface RLMStreamResult extends RLMGenerateResult {
textStream: ReadableStream<string>; // Emits text-delta content
fullStream: ReadableStream>; // Emits start/start-step/finish-step/text/finish events
}
```
### createRLMTool
Factory function to create RLM as an AI SDK-compatible tool.
#### `createRLMTool(config?: RLMToolConfig)`
```typescript
import type { LanguageModel } from 'ai';
function createRLMTool(config?: {
  model?: LanguageModel; // Root agent model
  subModel?: LanguageModel; // Sub-LLM model
  maxIterations?: number; // Max iterations (default: 20)
  maxLLMCalls?: number; // Max LLM calls (default: 50)
  maxOutputChars?: number; // Max output chars (default: 100000)
  logger?: RLMLogger; // Optional injected logger
  logLevel?: RLMLogLevel; // Log level for internal diagnostics
}): Tool
```
**Tool Input Schema:**
```typescript
{
  context: string | string[] | Record<string, unknown>;
  query: string;
  maxIterations?: number; // Optional override
  maxLLMCalls?: number; // Optional override
}
```
**Tool Output:**
```typescript
{
  answer: string; // The generated answer
  iterations: number; // Number of iterations
  stepsTaken: number; // Number of steps executed
}
```
### RLMContext
Context can be any of these formats:
```typescript
type RLMContext = string | string[] | Record<string, unknown>;
```
- `string`: Raw text document
- `string[]`: Array of lines or documents
- `Record<string, unknown>`: JSON/structured data
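Since a context can arrive in any of these shapes, exploration code usually starts by checking which one it has. The sketch below is a hypothetical helper, not a library function: `describeContext` and `ContextLike` are illustrative names, and the record's value type is assumed to be `unknown`.

```typescript
// Local stand-in for the three documented context shapes.
type ContextLike = string | string[] | Record<string, unknown>;

// Reports the shape of a context and a rough size for it.
function describeContext(context: ContextLike): { kind: string; size: number } {
  if (typeof context === "string") {
    return { kind: "string", size: context.length }; // size in characters
  }
  if (Array.isArray(context)) {
    return { kind: "array", size: context.length }; // size in items
  }
  return { kind: "object", size: Object.keys(context).length }; // size in top-level keys
}

console.log(describeContext("The magic number is 42."));
console.log(describeContext(["doc one", "doc two"]));
console.log(describeContext({ title: "Report", pages: 12 }));
```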
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                       RLMAgent Class                        │
├─────────────────────────────────────────────────────────────┤
│  ┌───────────────────────────────────────────────────────┐  │
│  │  REPL Environment (QuickJS)                           │  │
│  │  - Sandboxed JavaScript execution                     │  │
│  │  - llm_query() for sub-LLM semantic analysis          │  │
│  │  - 30s timeout protection                             │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  generate() Method                                    │  │
│  │  1. Generate reasoning + JS code                      │  │
│  │  2. Execute in sandboxed context                      │  │
│  │  3. Process llm_query markers → real LLM calls        │  │
│  │  4. Check for FINAL() answer                          │  │
│  │  5. Repeat or return answer                           │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  stream() Method                                      │  │
│  │  - Delegates to generate()                            │  │
│  │  - Emits start-step / finish-step progress events     │  │
│  │  - Emits text-start / text-delta / text-end / finish  │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
                               │ createRLMTool()
                               ▼
                    ┌──────────────────────┐
                    │  AI SDK Tool         │
                    │  - Tool interface    │
                    │  - Input validation  │
                    │  - Auto-execution    │
                    └──────────────────────┘
```
## Examples
Run the examples:
```bash
# Basic agent examples
bun run examples/basic-usage.ts
# Tool integration examples
bun run examples/tool-usage.ts
# Individual examples
bun run -e "import { example1SimpleTextSearch } from './examples/basic-usage.ts'; example1SimpleTextSearch()"
```
## CLI Codebase Search
This repo includes a local CLI script for searching a codebase with `RLMAgent`.
The CLI uses a `ToolLoopAgent` orchestrator with these tools:
- `list_files`
- `search_files`
- `read_file`
- `analyze_with_rlm` (deep analysis on selected files)
This avoids preloading the entire repository into one context window.
```bash
npm run code-search -- ./path/to/codebase "Where is authentication handled?"
```
You can also run the bin directly:
```bash
node ./bin/rlm-codebase-search.js ./path/to/codebase "How are API routes defined?"
```
Required environment variable:
```bash
export OPENAI_API_KEY="your_key_here"
```
### Example Files
- **`examples/basic-usage.ts`**: Agent API examples (generate, stream, callbacks)
- **`examples/tool-usage.ts`**: Tool API examples (with generateText, ToolLoopAgent)
- **`examples/document-comparison.ts`**: Document diffing example
- **`examples/data-transformation.ts`**: Data extraction and transformation
## License
MIT
## References
- Paper: "Recursive Language Models" (Zhang, Kraska, Khattab, 2025)
- AI SDK Documentation: https://sdk.vercel.ai/docs