# ContextChef
Context compiler for TypeScript/JavaScript AI agents.
ContextChef solves the most common context engineering problems in AI agent development: conversations too long for the model to remember, too many tools causing hallucinations, having to rewrite prompts when switching providers, and state drift in long-running tasks. It doesn't take over your control flow — it just compiles your state into an optimal payload before each LLM call.
[Chinese documentation (中文文档)](./README.zh-CN.md)
## Packages
| Package | Description |
|---|---|
| [`@context-chef/core`](./packages/core) | Core context compiler — history compression, tool pruning, memory, VFS offloading, multi-provider adapters |
| [`@context-chef/ai-sdk-middleware`](./packages/ai-sdk-middleware) | [Vercel AI SDK](https://sdk.vercel.ai) middleware — drop-in context engineering with zero code changes |
### Zero-config AI SDK integration
If you use the Vercel AI SDK, you can get transparent history compression and tool result truncation by wrapping your model in a single call:
```typescript
import { withContextChef } from '@context-chef/ai-sdk-middleware';
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
const model = withContextChef(openai('gpt-4o'), {
contextWindow: 128_000,
compress: { model: openai('gpt-4o-mini') },
truncate: { threshold: 5000 },
});
// Everything below stays exactly the same
const result = await generateText({ model, messages, tools });
```
See the [`@context-chef/ai-sdk-middleware` README](./packages/ai-sdk-middleware/README.md) for full documentation.
### Full control with `@context-chef/core`
For direct control over the compilation pipeline — dynamic state injection, tool namespaces, memory, snapshot/restore — use the core library directly; the Quick Start below walks through the full pipeline.
## Blog Series
1. [Why "Compile" Your Context](https://myprototypewhat.cn/context-chef-1-why-compile-context-en)
2. [Janitor — Separating Trigger Logic from Compression Policy](https://myprototypewhat.cn/context-chef-2-janitor-en)
3. [Pruner — Decoupling Tool Registration from Routing](https://myprototypewhat.cn/context-chef-3-pruner-en)
4. [Offloader/VFS — Relocate Information, Don't Destroy It](https://myprototypewhat.cn/context-chef-4-offloader-vfs-en)
5. [Core Memory — Zero-Cost Reads, Structured Writes](https://myprototypewhat.cn/context-chef-5-core-memory-en)
6. [Snapshot & Restore — Capture Everything That Determines the Next Compile](https://myprototypewhat.cn/context-chef-6-snapshot-en)
7. [The Provider Adapter Layer — Let Differences Stop at Compile Time](https://myprototypewhat.cn/context-chef-7-adapters-en)
8. [Five Extension Points in the Compile Pipeline](https://myprototypewhat.cn/context-chef-8-hooks-en)
## Features
- **Conversations too long?** — Automatically compress history, preserve recent memory, delegate old messages to a small model for summarization
- **Too many tools?** — Dynamically prune the tool list per task, or use a two-layer architecture (stable namespaces + on-demand loading) to eliminate tool hallucinations
- **Switching providers?** — Same prompt architecture compiles to OpenAI / Anthropic / Gemini with automatic prefill, cache, and tool call format adaptation
- **Long tasks drifting?** — Zod schema-based state injection forces the model to stay aligned with the current task on every call
- **Terminal output too large?** — Auto-truncate and offload to VFS, keeping error lines + a `context://` URI pointer for on-demand retrieval
- **Can't remember across sessions?** — Memory lets the model persist key information (project rules, user preferences) via tool calls, auto-injected on the next session
- **Need to rollback?** — Snapshot & Restore captures and rolls back full context state for branching and exploration
- **Need external context?** — `onBeforeCompile` hook lets you inject RAG results, AST snippets, or MCP queries before compilation
- **Need observability?** — Unified event system (`chef.on('compress', ...)`) for logging, metrics, and debugging across all internal modules
## Installation
```bash
npm install @context-chef/core zod
```
## Quick Start
```typescript
import { ContextChef } from "@context-chef/core";
import { z } from "zod";
const TaskSchema = z.object({
activeFile: z.string(),
todo: z.array(z.string()),
});
const chef = new ContextChef({
janitor: {
contextWindow: 200000,
compressionModel: async (msgs) => callGpt4oMini(msgs),
},
});
const payload = await chef
.setSystemPrompt([
{
role: "system",
content: "You are an expert coder.",
_cache_breakpoint: true,
},
])
.setHistory(conversationHistory)
.setDynamicState(TaskSchema, {
activeFile: "auth.ts",
todo: ["Fix login bug"],
})
.withGuardrails({
enforceXML: { outputTag: "response" },
    prefill: "<response>\n1.",
})
.compile({ target: "anthropic" });
const response = await anthropic.messages.create(payload);
```
---
## API Reference
### `new ContextChef(config?)`
```typescript
const chef = new ContextChef({
vfs?: { threshold?: number, storageDir?: string },
janitor?: JanitorConfig,
pruner?: { strategy?: 'union' | 'intersection' },
memory?: MemoryConfig,
  transformContext?: (messages: Message[]) => Message[] | Promise<Message[]>,
  onBeforeCompile?: (context: BeforeCompileContext) => string | null | Promise<string | null>,
});
```
### Context Building
#### `chef.setSystemPrompt(messages): this`
Sets the static system prompt layer. Cached prefix — should rarely change.
```typescript
chef.setSystemPrompt([
{
role: "system",
content: "You are an expert coder.",
_cache_breakpoint: true,
},
]);
```
`_cache_breakpoint: true` tells the Anthropic adapter to inject `cache_control: { type: 'ephemeral' }`.
#### `chef.setHistory(messages): this`
Sets the conversation history. Janitor compresses automatically on `compile()`.
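A minimal sketch, assuming the `role`/`content` message shape used throughout this README:
```typescript
// e.g. prior turns loaded from your session store
chef.setHistory([
  { role: "user", content: "Fix the login bug in auth.ts." },
  { role: "assistant", content: "Looking at auth.ts now..." },
]);
```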
#### `chef.setDynamicState(schema, data, options?): this`
Injects Zod-validated state as XML into the context.
```typescript
const TaskSchema = z.object({
activeFile: z.string(),
todo: z.array(z.string()),
});
chef.setDynamicState(TaskSchema, { activeFile: "auth.ts", todo: ["Fix bug"] });
// placement defaults to 'last_user' (injected into the last user message)
// use { placement: 'system' } for a standalone system message
```
#### `chef.withGuardrails(options): this`
Applies output format guardrails and optional prefill.
```typescript
chef.withGuardrails({
enforceXML: { outputTag: "final_code" }, // wraps output rules in EPHEMERAL_MESSAGE
  prefill: "<final_code>\n1.", // trailing assistant message (auto-degraded for OpenAI/Gemini)
});
```
#### `chef.compile(options?): Promise<OpenAIPayload | AnthropicPayload | GeminiPayload>`
Compiles everything into a provider-ready payload. Triggers Janitor compression. Registered tools are auto-included.
```typescript
const payload = await chef.compile({ target: "openai" }); // OpenAIPayload
const payload = await chef.compile({ target: "anthropic" }); // AnthropicPayload
const payload = await chef.compile({ target: "gemini" }); // GeminiPayload
```
---
### History Compression (Janitor)
Janitor provides two compression paths. Choose the one that fits your setup:
#### Path 1: Tokenizer (precise control)
Provide your own token counting function for precise per-message calculation. Janitor preserves recent messages that fit within `contextWindow × preserveRatio` and compresses the rest.
```typescript
const chef = new ContextChef({
janitor: {
contextWindow: 200000,
tokenizer: (msgs) =>
msgs.reduce((sum, m) => sum + encode(m.content).length, 0),
preserveRatio: 0.8, // keep 80% of contextWindow for recent messages (default)
compressionModel: async (msgs) => callGpt4oMini(msgs),
onCompress: async (summary, count) => {
await db.saveCompression(sessionId, summary, count);
},
},
});
```
#### Path 2: reportTokenUsage (simple, no tokenizer needed)
Most LLM APIs return token usage in their response. Feed that value back — when it exceeds `contextWindow`, Janitor compresses everything except the last N messages.
```typescript
const chef = new ContextChef({
janitor: {
contextWindow: 200000,
preserveRecentMessages: 1, // keep last 1 message on compression (default)
compressionModel: async (msgs) => callGpt4oMini(msgs),
},
});
// After each LLM call:
const response = await openai.chat.completions.create({ ... });
chef.reportTokenUsage(response.usage.prompt_tokens);
```
> **Note:** Without a `compressionModel`, old messages are discarded with no summary. A console warning is printed at construction time if neither `tokenizer` nor `compressionModel` is provided.
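The `callGpt4oMini` placeholder used in these examples is any function that turns old messages into a summary string. A minimal sketch with the official OpenAI SDK (flattening the IR messages into a single user message is an assumption for illustration, not library behavior):
```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Summarize old history with a cheap model. Janitor supplies its own default
// compression prompt; this hook only needs to return the model's raw text.
async function callGpt4oMini(
  msgs: { role: string; content: unknown }[],
): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "user",
        content: msgs.map((m) => `${m.role}: ${String(m.content)}`).join("\n"),
      },
    ],
  });
  return response.choices[0].message.content ?? "";
}
```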
#### `JanitorConfig`
| Option | Type | Default | Description |
| ------------------------------- | ------------------------------------------- | ---------- | -------------------------------------------------------------------------------------------- |
| `contextWindow` | `number` | _required_ | Model's context window size (tokens). Compression triggers when usage exceeds this. |
| `tokenizer` | `(msgs: Message[]) => number` | — | Enables the tokenizer path for precise per-message token calculation. |
| `preserveRatio` | `number` | `0.8` | [Tokenizer path] Ratio of `contextWindow` to preserve for recent messages. |
| `preserveRecentMessages` | `number` | `1` | [reportTokenUsage path] Number of recent turns to keep when compressing. |
| `compressionModel` | `(msgs: Message[]) => Promise<string>` | — | Async hook to summarize old messages via a low-cost LLM. |
| `customCompressionInstructions` | `string` | — | Additional focused instructions appended to the default compression prompt (additive, not replacement). |
| `onCompress` | `(summary, count) => void` | — | Fires after compression with the summary message and truncated count. |
| `onBeforeCompress` | `(history, tokenInfo) => Message[] \| null` | — | Fires before LLM compression. Return modified history to intervene, or null to proceed normally. |
**Compression output contract.** Janitor's default prompt instructs the compression model to produce a scratchpad block (stripped from the final output) followed by a structured summary block with 5 domain-agnostic sections (Task Overview / Current State / Important Discoveries / Next Steps / Context to Preserve). Raw output is piped through `Prompts.formatCompactSummary` before injection. See the [core package README](./packages/core) for the full contract and `customCompressionInstructions` usage.
**Circuit breaker.** If `compressionModel` throws three times in a row, `compress()` becomes a no-op until the next successful compression or an explicit `janitor.reset()` / `chef.clearHistory()`. The failure counter is preserved by `chef.snapshot()` / `chef.restore()`.
#### `chef.reportTokenUsage(tokenCount): this`
Feed the API-reported token count. On the next `compile()`, if this value exceeds `contextWindow`, compression is triggered. In the tokenizer path, the higher of the local calculation and the fed value is used.
```typescript
const response = await openai.chat.completions.create({ ... });
chef.reportTokenUsage(response.usage.prompt_tokens);
```
#### `onBeforeCompress` hook
Fires when the token budget is exceeded, **before** LLM compression. Return a modified `Message[]` to replace the history, or return `null` to let default compression proceed.
```typescript
const chef = new ContextChef({
janitor: {
contextWindow: 200000,
tokenizer: (msgs) => countTokens(msgs),
onBeforeCompress: (history, { currentTokens, limit }) => {
// Example: offload large tool results to VFS before compression
return history.map((msg) =>
msg.role === "tool" && msg.content.length > 5000
? { ...msg, content: pointer.offload(msg.content).content }
: msg,
);
},
},
});
```
#### Mechanical Compaction (`compact`)
Strip content from history at zero LLM cost. Use proactively in your agent loop to keep context lean.
```typescript
// Clear all tool results and thinking blocks
history = janitor.compact(history, { clear: ['tool-result', 'thinking'] });
// Keep the 5 most recent tool results, clear the rest (min: 1)
history = janitor.compact(history, {
clear: [{ target: 'tool-result', keepRecent: 5 }],
});
// Combine: clear old tool results + all thinking
history = janitor.compact(history, {
clear: [{ target: 'tool-result', keepRecent: 5 }, 'thinking'],
});
```
#### `ensureValidHistory(history)`
Standalone utility that sanitizes message history to satisfy LLM API invariants (tool pair completeness, message alternation). Use when loading history from a database or after manual modifications.
```typescript
import { ensureValidHistory } from '@context-chef/core';
const safeHistory = ensureValidHistory(rawHistory);
chef.setHistory(safeHistory);
```
#### `chef.clearHistory(): this`
Explicitly clear history and reset Janitor state when switching topics or completing sub-tasks.
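For example, at a sub-task boundary (`nextTask` is a placeholder for your own next prompt):
```typescript
// Sub-task done: reset Janitor and start the next topic with a clean slate
chef.clearHistory().setHistory([{ role: "user", content: nextTask }]);
```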
---
### Large Output Offloading (Offloader / VFS)
```typescript
// Offload if content exceeds threshold; preserves last 2000 chars by default
const safeLog = chef.offload(rawTerminalOutput);
history.push({ role: "tool", content: safeLog, tool_call_id: "call_123" });
// safeLog: original content if small, or truncated with context://vfs/ URI
// Preserve head (first 500 chars) + tail (last 1000 chars), snapped to line boundaries
const safeOutput = chef.offload(content, { headChars: 500, tailChars: 1000 });
// No preview content — just truncation notice + URI
const safeDoc = chef.offload(largeFileContent, { headChars: 0, tailChars: 0 });
// Override threshold per call
const safeOutput2 = chef.offload(content, { threshold: 2000, tailChars: 500 });
```
Register a tool for the LLM to read full content when needed:
```typescript
// In your tool handler:
import { Offloader } from "@context-chef/core";
const offloader = new Offloader({ storageDir: ".context_vfs" });
const fullContent = offloader.resolve(uri);
```
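In flat mode, the corresponding tool definition might look like this (the tool name and its routing to `offloader.resolve` are illustrative, not part of the library):
```typescript
// Let the LLM fetch full offloaded content on demand
chef.registerTools([
  {
    name: "read_context_uri", // hypothetical name; pick your own
    description: "Read the full content behind a context://vfs/ URI",
    tags: ["vfs", "read"],
  },
]);
// In the agent loop, route calls to this tool to offloader.resolve(uri)
```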
---
### Tool Management (Pruner)
#### Flat Mode
```typescript
chef.registerTools([
{ name: "read_file", description: "Read a file", tags: ["file", "read"] },
{ name: "run_bash", description: "Run a command", tags: ["shell"] },
{
name: "get_time",
description: "Get timestamp" /* no tags = always kept */,
},
]);
const { tools, removed } = chef.getPruner().pruneByTask("Read the auth.ts file");
// tools: [read_file, get_time]
```
Also supports `allowOnly(names)` and `pruneByTaskAndAllowlist(task, names)`.
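Both presumably return the same `{ tools, removed }` shape as `pruneByTask` (an assumption based on the example above):
```typescript
// Hard allowlist for a sensitive step
const { tools } = chef.getPruner().allowOnly(["read_file"]);

// Task-based pruning constrained to an allowlist
const { tools: scoped } = chef
  .getPruner()
  .pruneByTaskAndAllowlist("Read the auth.ts file", ["read_file", "get_time"]);
```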
#### Namespace + Lazy Loading (Two-Layer Architecture)
**Layer 1 — Namespaces**: Core tools grouped into stable tool definitions. The tool list never changes across turns.
**Layer 2 — Lazy Loading**: Long-tail tools registered as a lightweight XML directory. The LLM loads full schemas on demand via `load_toolkit`.
```typescript
// Layer 1: Stable namespace tools
chef.registerNamespaces([
{
name: "file_ops",
description: "File system operations",
tools: [
{
name: "read_file",
description: "Read a file",
parameters: { path: { type: "string" } },
},
{
name: "write_file",
description: "Write to a file",
parameters: { path: { type: "string" }, content: { type: "string" } },
},
],
},
{
name: "terminal",
description: "Shell command execution",
tools: [
{
name: "run_bash",
description: "Execute a command",
parameters: { command: { type: "string" } },
},
],
},
]);
// Layer 2: On-demand toolkits
chef.registerToolkits([
{
name: "Weather",
description: "Weather forecast APIs",
tools: [
/* ... */
],
},
{
name: "Database",
description: "SQL query and schema inspection",
tools: [
/* ... */
],
},
]);
// Compile — tools: [file_ops, terminal, load_toolkit] (always stable)
const { tools, directoryXml } = chef.getPruner().compile();
// directoryXml: inject into system prompt so LLM knows available toolkits
```
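One way to surface the directory to the model (a sketch; appending `directoryXml` to your system prompt content is one option, not a required pattern):
```typescript
chef.setSystemPrompt([
  {
    role: "system",
    content: `You are an expert coder.\n\n${directoryXml}`,
    _cache_breakpoint: true,
  },
]);
```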
**Agent Loop integration:**
```typescript
for (const toolCall of response.tool_calls) {
if (chef.getPruner().isNamespaceCall(toolCall)) {
// Route namespace call to real tool
const { toolName, args } = chef.getPruner().resolveNamespace(toolCall);
const result = await executeTool(toolName, args);
} else if (chef.getPruner().isToolkitLoader(toolCall)) {
// LLM requested a toolkit — expand and re-call
const parsed = JSON.parse(toolCall.function.arguments);
const newTools = chef.getPruner().extractToolkit(parsed.toolkit_name);
// Merge newTools into the next LLM request
}
}
```
---
### Memory
Persistent key-value memory that survives across sessions. Memory is modified via tool calls (`create_memory` / `modify_memory`), which are auto-injected into the payload on `compile()`.
```typescript
import { InMemoryStore, VFSMemoryStore } from "@context-chef/core";
const chef = new ContextChef({
memory: {
store: new InMemoryStore(), // ephemeral (testing)
// store: new VFSMemoryStore(dir), // persistent (production)
},
});
// In your agent loop, intercept memory tool calls:
for (const toolCall of response.tool_calls) {
if (toolCall.function.name === "create_memory") {
const { key, value, description } = JSON.parse(toolCall.function.arguments);
await chef.getMemory().createMemory(key, value, description);
} else if (toolCall.function.name === "modify_memory") {
const { action, key, value, description } = JSON.parse(toolCall.function.arguments);
if (action === "update") {
await chef.getMemory().updateMemory(key, value, description);
} else {
await chef.getMemory().deleteMemory(key);
}
}
}
// Direct read/write (developer use, bypasses validation hooks)
await chef.getMemory().set("persona", "You are a senior engineer", {
description: "The agent's persona and role",
});
const value = await chef.getMemory().get("persona");
// On compile():
// - Memory tools (create_memory, modify_memory) are auto-injected into payload.tools
// - Existing memories are injected as XML between systemPrompt and history
```
---
### Snapshot & Restore
Capture and rollback full context state for branching or error recovery.
```typescript
const snap = chef.snapshot("before risky tool call");
// ... agent executes tool, something goes wrong ...
chef.restore(snap); // rolls back everything: history, dynamic state, janitor state, memory
```
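Restoring the same snapshot repeatedly gives you branching from a fixed point (`tryApproach` is a placeholder for your own agent step):
```typescript
const fork = chef.snapshot("decision point");
for (const approach of ["refactor", "rewrite"]) {
  chef.restore(fork); // each branch starts from identical context state
  await tryApproach(chef, approach);
}
```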
---
### Lifecycle Events
Unified event system for observability across all internal modules. Subscribe via `chef.on()`, unsubscribe via `chef.off()`.
```typescript
// Log when history gets compressed
chef.on('compress', ({ summary, truncatedCount }) => {
console.log(`Compressed ${truncatedCount} messages`);
});
// Track compile metrics
chef.on('compile:done', ({ payload }) => {
metrics.track('compile', { messageCount: payload.messages.length });
});
// Monitor memory changes
chef.on('memory:changed', ({ type, key, value }) => {
console.log(`Memory ${type}: ${key}`);
});
```
#### Available Events
| Event | Payload | Description |
|---|---|---|
| `compile:start` | `{ systemPrompt, history }` | Emitted at the start of `compile()` |
| `compile:done` | `{ payload }` | Emitted after `compile()` produces the final payload |
| `compress` | `{ summary, truncatedCount }` | Emitted after Janitor compresses history |
| `memory:changed` | `{ type, key, value, oldValue }` | Emitted after any memory mutation (set, delete, expire) |
| `memory:expired` | `MemoryEntry` | Emitted when a memory entry expires during `compile()` |
Events are **observation-only** — they don't affect control flow. Intercept hooks (`onBeforeCompress`, `onMemoryUpdate`, `onBeforeCompile`, `transformContext`) remain as config callbacks.
Events coexist with existing config callbacks: if you provide `onCompress` in `JanitorConfig`, it fires first, then the `compress` event is emitted.
---
### `onBeforeCompile` Hook
Inject external context (RAG, AST snippets, MCP queries) right before compilation without modifying the message array.
```typescript
const chef = new ContextChef({
onBeforeCompile: async (ctx) => {
const snippets = await vectorDB.search(ctx.dynamicStateXml);
return snippets.map((s) => s.content).join("\n");
    // Injected as a dedicated XML block alongside the dynamic state
// Return null to skip injection
},
});
```
---
### Input Adapters (Provider → IR)
Convert OpenAI / Anthropic / Gemini native messages to ContextChef IR, automatically separating system and history:
```typescript
import { fromOpenAI, fromAnthropic, fromGemini } from "@context-chef/core";
// OpenAI
const { system, history } = fromOpenAI(openaiMessages);
chef.setSystemPrompt(system).setHistory(history);
// Anthropic (system is a separate top-level parameter)
const { system, history } = fromAnthropic(anthropicMessages, anthropicSystem);
chef.setSystemPrompt(system).setHistory(history);
// Gemini (systemInstruction is a separate top-level parameter)
const { system, history } = fromGemini(geminiContents, systemInstruction);
chef.setSystemPrompt(system).setHistory(history);
```
Multimodal content (images, files) is automatically converted to IR `attachments`:
| Provider Format | IR Field |
|---|---|
| OpenAI `image_url` / `file` | `attachments: [{ mediaType, data }]` |
| Anthropic `image` / `document` | `attachments: [{ mediaType, data }]` |
| Gemini `inlineData` / `fileData` | `attachments: [{ mediaType, data }]` |
`compile()` converts `attachments` back to the corresponding provider format. During compression, Janitor guides the compression model to describe image content.
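A sketch of the IR shape implied by the table above (the exact placement of `attachments` on a message is an assumption, and `base64Png` is a placeholder):
```typescript
chef.setHistory([
  {
    role: "user",
    content: "What does this diagram show?",
    attachments: [{ mediaType: "image/png", data: base64Png }],
  },
]);
// compile({ target: "openai" }) re-emits this as an image_url content part
```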
---
### Target Adapters
| Feature | OpenAI | Anthropic | Gemini |
| ---------------------------- | --------------------------- | -------------------------------------- | ------------------------------------------ |
| Format | Chat Completions | Messages API | generateContent |
| Cache breakpoints | Stripped | `cache_control: { type: 'ephemeral' }` | Stripped (uses separate CachedContent API) |
| Prefill (trailing assistant) | Degraded to `[System Note]` | Native support | Degraded to `[System Note]` |
| `thinking` field | Stripped | Mapped to `ThinkingBlockParam` | Stripped |
| Tool calls | `tool_calls` array | `tool_use` blocks | `functionCall` parts |
| `attachments` | `image_url` / `file` content parts | `image` / `document` blocks | `inlineData` / `fileData` parts |
Adapters are selected automatically by `compile({ target })`. You can also use them standalone:
```typescript
import { getAdapter } from "@context-chef/core";
const adapter = getAdapter("gemini");
const payload = adapter.compile(messages);
```
---
## Skills
ContextChef provides [Claude Code Skills](https://docs.anthropic.com/en/docs/claude-code/skills) that help you integrate the library into your project interactively. Each skill analyzes your existing codebase and generates tailored integration code.
| Skill | Description |
|---|---|
| `context-chef-core` | Integrate `@context-chef/core` — full control over compilation pipeline, multi-provider support |
| `context-chef-middleware` | Integrate `@context-chef/ai-sdk-middleware` — drop-in AI SDK middleware, zero code changes |
### Install
Install only what you need:
```bash
# Core library (OpenAI / Anthropic / Gemini direct SDK usage)
npx skills add MyPrototypeWhat/context-chef --skill context-chef-core
# AI SDK middleware (Vercel AI SDK v6+)
npx skills add MyPrototypeWhat/context-chef --skill context-chef-middleware
# Both
npx skills add MyPrototypeWhat/context-chef
```
### Use
Open [Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview) in your project and type:
```
/context-chef-core
# or
/context-chef-middleware
```
Claude will:
1. **Detect your setup** — LLM SDK, package manager, TypeScript vs JavaScript
2. **Ask about your needs** — history compression, tool management, truncation, memory, etc.
3. **Generate integration code** — tailored to your project structure and existing agent loop
4. **Explain the architecture** — processing pipeline, cache breakpoints, dynamic state placement