{"id":48344204,"url":"https://github.com/myprototypewhat/context-chef","last_synced_at":"2026-05-28T05:01:46.445Z","repository":{"id":339879682,"uuid":"1163447653","full_name":"MyPrototypeWhat/context-chef","owner":"MyPrototypeWhat","description":"Context compiler for TypeScript/JavaScript AI agents. Automatically compiles agent state into optimized LLM payloads with history compression, tool pruning, multi-provider support, and more.","archived":false,"fork":false,"pushed_at":"2026-04-05T04:40:22.000Z","size":34721,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-05T06:33:03.969Z","etag":null,"topics":["ai-agent","ai-sdk","anthropic","compression","context-engineering","context-window-optimization","gemini","llm","middleware","monorepo","openai","typescript","vercel-ai"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MyPrototypeWhat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-21T16:44:21.000Z","updated_at":"2026-04-05T04:39:59.000Z","dependencies_parsed_at":"2026-02-22T10:00:32.126Z","dependency_job_id":null,"html_url":"https://github.com/MyPrototypeWhat/context-chef","commit_stats":null,"previous_names":["myprototypewhat/context-chef"],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/MyPrototypeWhat/context-chef","repository_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MyPrototypeWhat%2Fcontext-chef","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MyPrototypeWhat%2Fcontext-chef/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MyPrototypeWhat%2Fcontext-chef/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MyPrototypeWhat%2Fcontext-chef/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MyPrototypeWhat","download_url":"https://codeload.github.com/MyPrototypeWhat/context-chef/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MyPrototypeWhat%2Fcontext-chef/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31793225,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T02:24:21.117Z","status":"ssl_error","status_checked_at":"2026-04-14T02:24:20.627Z","response_time":153,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","ai-sdk","anthropic","compression","context-engineering","context-window-optimization","gemini","llm","middleware","monorepo","openai","typescript","vercel-ai"],"created_at":"2026-04-05T06:02:39.843Z","updated_at":"2026-05-28T05:01:46.436Z","avatar_url":"https://github.com/MyPrototypeWhat.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ContextChef\n\n[![npm version](https://img.shields.io/npm/v/@context-chef/core.svg)](https://www.npmjs.com/package/@context-chef/core)\n[![@context-chef/core Downloads](https://img.shields.io/npm/dm/@context-chef/core.svg?label=%40context-chef%2Fcore%20downloads)](https://www.npmjs.com/package/@context-chef/core)\n[![@context-chef/ai-sdk-middleware Downloads](https://img.shields.io/npm/dm/@context-chef/ai-sdk-middleware.svg?label=%40context-chef%2Fai-sdk-middleware%20downloads)](https://www.npmjs.com/package/@context-chef/ai-sdk-middleware)\n[![@context-chef/tanstack-ai 
Downloads](https://img.shields.io/npm/dm/@context-chef/tanstack-ai.svg?label=%40context-chef%2Ftanstack-ai%20downloads)](https://www.npmjs.com/package/@context-chef/tanstack-ai)\n[![License](https://img.shields.io/npm/l/@context-chef/core.svg)](https://github.com/MyPrototypeWhat/context-chef/blob/main/LICENSE)\n[![TypeScript](https://img.shields.io/badge/TypeScript-5.9-blue.svg)](https://www.typescriptlang.org/)\n[![CI](https://github.com/MyPrototypeWhat/context-chef/actions/workflows/ci.yml/badge.svg)](https://github.com/MyPrototypeWhat/context-chef/actions/workflows/ci.yml)\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./ContextChef.gif\" alt=\"ContextChef Demo\" width=\"600\" /\u003e\n\u003c/p\u003e\n\nContext compiler for TypeScript/JavaScript AI agents.\n\nContextChef solves the most common context engineering problems in AI agent development: conversations too long for the model to remember, too many tools causing hallucinations, having to rewrite prompts when switching providers, and state drift in long-running tasks. 
It doesn't take over your control flow — it just compiles your state into an optimal payload before each LLM call.\n\n[中文文档](./README.zh-CN.md)\n\n## Packages\n\n| Package | Description |\n|---|---|\n| [`@context-chef/core`](./packages/core) | Core context compiler — history compression, tool pruning, memory, VFS offloading, multi-provider adapters |\n| [`@context-chef/ai-sdk-middleware`](./packages/ai-sdk-middleware) | [Vercel AI SDK](https://sdk.vercel.ai) middleware — drop-in context engineering with zero code changes |\n| [`@context-chef/tanstack-ai`](./packages/tanstack-ai) | [TanStack AI](https://tanstack.com/ai) middleware — compression, truncation, and dynamic state via `ChatMiddleware` |\n\n### Zero-config AI SDK integration\n\nIf you use the Vercel AI SDK, you can get transparent history compression and tool result truncation with just 2 lines:\n\n```typescript\nimport { withContextChef } from '@context-chef/ai-sdk-middleware';\nimport { openai } from '@ai-sdk/openai';\nimport { generateText } from 'ai';\n\nconst model = withContextChef(openai('gpt-4o'), {\n  contextWindow: 128_000,\n  compress: { model: openai('gpt-4o-mini') },\n  truncate: { threshold: 5000 },\n});\n\n// Everything below stays exactly the same\nconst result = await generateText({ model, messages, tools });\n```\n\nSee the [`@context-chef/ai-sdk-middleware` README](./packages/ai-sdk-middleware/README.md) for full documentation.\n\n### TanStack AI middleware\n\nIf you use TanStack AI, drop in the middleware for transparent context management:\n\n```typescript\nimport { contextChefMiddleware } from '@context-chef/tanstack-ai';\nimport { chat } from '@tanstack/ai';\nimport { openaiText } from '@tanstack/ai-openai';\n\nconst stream = chat({\n  adapter: openaiText('gpt-4o'),\n  messages,\n  middleware: [\n    contextChefMiddleware({\n      contextWindow: 128_000,\n      compress: { adapter: openaiText('gpt-4o-mini') },\n      truncate: { threshold: 5000 },\n    }),\n  ],\n});\n```\n\nSee the 
[`@context-chef/tanstack-ai` README](./packages/tanstack-ai/README.md) for full documentation.\n\n### Full control with `@context-chef/core`\n\nFor direct control over the compilation pipeline — dynamic state injection, tool namespaces, memory, snapshot/restore — use the core library directly:\n\n## Blog Series\n\n1. [Why \"Compile\" Your Context](https://myprototypewhat.cn/context-chef-1-why-compile-context-en)\n2. [Janitor — Separating Trigger Logic from Compression Policy](https://myprototypewhat.cn/context-chef-2-janitor-en)\n3. [Pruner — Decoupling Tool Registration from Routing](https://myprototypewhat.cn/context-chef-3-pruner-en)\n4. [Offloader/VFS — Relocate Information, Don't Destroy It](https://myprototypewhat.cn/context-chef-4-offloader-vfs-en)\n5. [Core Memory — Zero-Cost Reads, Structured Writes](https://myprototypewhat.cn/context-chef-5-core-memory-en)\n6. [Snapshot \u0026 Restore — Capture Everything That Determines the Next Compile](https://myprototypewhat.cn/context-chef-6-snapshot-en)\n7. [The Provider Adapter Layer — Let Differences Stop at Compile Time](https://myprototypewhat.cn/context-chef-7-adapters-en)\n8. 
[Five Extension Points in the Compile Pipeline](https://myprototypewhat.cn/context-chef-8-hooks-en)\n\n## Features\n\n- **Conversations too long?** — Automatically compress history, preserve recent memory, delegate old messages to a small model for summarization\n- **Too many tools?** — Dynamically prune the tool list per task, or use a two-layer architecture (stable namespaces + on-demand loading) to eliminate tool hallucinations\n- **Need to block tools at runtime?** — Pruner blocklist + `checkToolCall` gate for permission, environment safety, rate limits, and sandboxing — KV-cache preserving by default\n- **Mode-based behavior?** — `Skill` primitive bundles instructions and tool annotations per phase; loadable from `SKILL.md` files (compatible with Claude Code / Mastra / OpenCode formats)\n- **Switching providers?** — Same prompt architecture compiles to OpenAI / Anthropic / Gemini with automatic prefill, cache, and tool call format adaptation\n- **Long tasks drifting?** — Zod schema-based state injection forces the model to stay aligned with the current task on every call\n- **Terminal output too large?** — Auto-truncate and offload to VFS, keeping error lines + a `context://` URI pointer for on-demand retrieval\n- **Can't remember across sessions?** — Memory lets the model persist key information (project rules, user preferences) via tool calls, auto-injected on the next session\n- **Need to roll back?** — Snapshot \u0026 Restore captures and rolls back full context state for branching and exploration\n- **Need external context?** — `onBeforeCompile` hook lets you inject RAG results, AST snippets, or MCP queries before compilation\n- **Need observability?** — Unified event system (`chef.on('compress', ...)`) for logging, metrics, and debugging across all internal modules\n\n## Installation\n\n```bash\nnpm install @context-chef/core zod\n```\n\n## Quick Start\n\n```typescript\nimport { ContextChef } from \"@context-chef/core\";\nimport { z } from 
\"zod\";\n\nconst TaskSchema = z.object({\n  activeFile: z.string(),\n  todo: z.array(z.string()),\n});\n\nconst chef = new ContextChef({\n  janitor: {\n    contextWindow: 200000,\n    compressionModel: async (msgs) =\u003e callGpt4oMini(msgs),\n  },\n});\n\nconst payload = await chef\n  .setSystemPrompt([\n    {\n      role: \"system\",\n      content: \"You are an expert coder.\",\n      _cache_breakpoint: true,\n    },\n  ])\n  .setHistory(conversationHistory)\n  .setDynamicState(TaskSchema, {\n    activeFile: \"auth.ts\",\n    todo: [\"Fix login bug\"],\n  })\n  .withGuardrails({\n    enforceXML: { outputTag: \"response\" },\n    prefill: \"\u003cthinking\u003e\\n1.\",\n  })\n  .compile({ target: \"anthropic\" });\n\nconst response = await anthropic.messages.create(payload);\n```\n\n---\n\n## API Reference\n\n### `new ContextChef(config?)`\n\n```typescript\nconst chef = new ContextChef({\n  vfs?: { threshold?: number, storageDir?: string, maxAge?: number, maxFiles?: number, maxBytes?: number, onVFSEvicted?: (entry, reason) =\u003e void },\n  janitor?: JanitorConfig,\n  pruner?: { strategy?: 'union' | 'intersection' },\n  memory?: MemoryConfig,\n  transformContext?: (messages: Message[]) =\u003e Message[] | Promise\u003cMessage[]\u003e,\n  onBeforeCompile?: (context: BeforeCompileContext) =\u003e string | null | Promise\u003cstring | null\u003e,\n});\n```\n\n### Context Building\n\n#### `chef.setSystemPrompt(messages): this`\n\nSets the static system prompt layer. Cached prefix — should rarely change.\n\n```typescript\nchef.setSystemPrompt([\n  {\n    role: \"system\",\n    content: \"You are an expert coder.\",\n    _cache_breakpoint: true,\n  },\n]);\n```\n\n`_cache_breakpoint: true` tells the Anthropic adapter to inject `cache_control: { type: 'ephemeral' }`.\n\n#### `chef.setHistory(messages): this`\n\nSets the conversation history. 
Janitor compresses automatically on `compile()`.\n\n#### `chef.setDynamicState(schema, data, options?): this`\n\nInjects Zod-validated state as XML into the context.\n\n```typescript\nconst TaskSchema = z.object({\n  activeFile: z.string(),\n  todo: z.array(z.string()),\n});\n\nchef.setDynamicState(TaskSchema, { activeFile: \"auth.ts\", todo: [\"Fix bug\"] });\n// placement defaults to 'last_user' (injected into the last user message)\n// use { placement: 'system' } for a standalone system message\n```\n\n#### `chef.withGuardrails(options): this`\n\nApplies output format guardrails and optional prefill.\n\n```typescript\nchef.withGuardrails({\n  enforceXML: { outputTag: \"final_code\" }, // wraps output rules in EPHEMERAL_MESSAGE\n  prefill: \"\u003cthinking\u003e\\n1.\", // trailing assistant message (auto-degraded for OpenAI/Gemini)\n});\n```\n\n#### `chef.compile(options?): Promise\u003cTargetPayload\u003e`\n\nCompiles everything into a provider-ready payload. Triggers Janitor compression. Registered tools are auto-included.\n\n```typescript\nconst openaiPayload = await chef.compile({ target: \"openai\" }); // OpenAIPayload\nconst anthropicPayload = await chef.compile({ target: \"anthropic\" }); // AnthropicPayload\nconst geminiPayload = await chef.compile({ target: \"gemini\" }); // GeminiPayload\n```\n\n---\n\n### History Compression (Janitor)\n\nJanitor provides two compression paths. Choose the one that fits your setup:\n\n#### Path 1: Tokenizer (precise control)\n\nProvide your own token counting function for precise per-message calculation. 
Janitor preserves recent messages that fit within `contextWindow × preserveRatio` and compresses the rest.\n\n```typescript\nconst chef = new ContextChef({\n  janitor: {\n    contextWindow: 200000,\n    tokenizer: (msgs) =\u003e\n      msgs.reduce((sum, m) =\u003e sum + encode(m.content).length, 0),\n    preserveRatio: 0.8, // keep 80% of contextWindow for recent messages (default)\n    compressionModel: async (msgs) =\u003e callGpt4oMini(msgs),\n    onCompress: async (summary, count) =\u003e {\n      await db.saveCompression(sessionId, summary, count);\n    },\n  },\n});\n```\n\n#### Path 2: reportTokenUsage (simple, no tokenizer needed)\n\nMost LLM APIs return token usage in their response. Feed that value back — when it exceeds `contextWindow`, Janitor compresses everything except the last N messages.\n\n```typescript\nconst chef = new ContextChef({\n  janitor: {\n    contextWindow: 200000,\n    preserveRecentMessages: 1,       // keep last 1 message on compression (default)\n    compressionModel: async (msgs) =\u003e callGpt4oMini(msgs),\n  },\n});\n\n// After each LLM call:\nconst response = await openai.chat.completions.create({ ... });\nchef.reportTokenUsage(response.usage.prompt_tokens);\n```\n\n\u003e **Note:** Without a `compressionModel`, old messages are discarded with no summary. A console warning is printed at construction time if neither `tokenizer` nor `compressionModel` is provided.\n\n#### `JanitorConfig`\n\n| Option                          | Type                                        | Default    | Description                                                                                  |\n| ------------------------------- | ------------------------------------------- | ---------- | -------------------------------------------------------------------------------------------- |\n| `contextWindow`                 | `number`                                    | _required_ | Model's context window size (tokens). 
Compression triggers when usage exceeds this.          |\n| `tokenizer`                     | `(msgs: Message[]) =\u003e number`               | —          | Enables the tokenizer path for precise per-message token calculation.                        |\n| `preserveRatio`                 | `number`                                    | `0.8`      | [Tokenizer path] Ratio of `contextWindow` to preserve for recent messages.                   |\n| `preserveRecentMessages`        | `number`                                    | `1`        | [reportTokenUsage path] Number of recent turns to keep when compressing.                     |\n| `usagePreference`               | `'max' \\| 'feedFirst' \\| 'tokenizerFirst'`  | `'max'`    | Which token source drives the trigger when both `tokenizer` and `reportTokenUsage` are set. Without `tokenizer`, the value union narrows to `'max' \\| 'feedFirst'` — TypeScript rejects `'tokenizerFirst'` at compile time. See the [core package README](./packages/core) for the full breakdown. |\n| `compressionModel`              | `(msgs: Message[]) =\u003e Promise\u003cstring\u003e`      | —          | Async hook to summarize old messages via a low-cost LLM.                                     |\n| `customCompressionInstructions` | `string`                                    | —          | Additional focused instructions appended to the default compression prompt (additive, not replacement). |\n| `onCompress`                    | `(summary, count) =\u003e void`                  | —          | Fires after compression with the summary message and truncated count.                        |\n| `onBeforeCompress`              | `(history, tokenInfo) =\u003e Message[] \\| null` | —          | Fires before LLM compression. Return modified history to intervene, or null to proceed normally. 
|\n\n**Compression output contract.** Janitor's default prompt instructs the compression model to produce an `\u003canalysis\u003e` scratchpad (stripped from the final output) followed by a structured `\u003csummary\u003e` block with 5 domain-agnostic sections (Task Overview / Current State / Important Discoveries / Next Steps / Context to Preserve). Raw output is piped through `Prompts.formatCompactSummary` before injection. See the [core package README](./packages/core) for the full contract and `customCompressionInstructions` usage.\n\n**Circuit breaker.** If `compressionModel` throws three times in a row, `compress()` becomes a no-op until the next successful compression or an explicit `janitor.reset()` / `chef.clearHistory()`. The failure counter is preserved by `chef.snapshot()` / `chef.restore()`.\n\n#### `chef.reportTokenUsage(tokenCount): this`\n\nFeed the API-reported token count. On the next `compile()`, if this value exceeds `contextWindow`, compression is triggered. In the tokenizer path, the default is to take the higher of the local calculation and the fed value; switch via `usagePreference` if you want `'feedFirst'` (trust the API truth) or `'tokenizerFirst'` (ignore fed entirely).\n\n```typescript\nconst response = await openai.chat.completions.create({ ... });\nchef.reportTokenUsage(response.usage.prompt_tokens);\n```\n\n#### `onBeforeCompress` hook\n\nFires when the token budget is exceeded, **before** LLM compression. Return a modified `Message[]` to replace the history, or return `null` to let default compression proceed.\n\n```typescript\nconst chef = new ContextChef({\n  janitor: {\n    contextWindow: 200000,\n    tokenizer: (msgs) =\u003e countTokens(msgs),\n    onBeforeCompress: (history, { currentTokens, limit }) =\u003e {\n      // Example: offload large tool results to VFS before compression\n      return history.map((msg) =\u003e\n        msg.role === \"tool\" \u0026\u0026 msg.content.length \u003e 5000\n          ? 
{ ...msg, content: pointer.offload(msg.content).content }\n          : msg,\n      );\n    },\n  },\n});\n```\n\n#### Mechanical Compaction (`compact`)\n\nStrip content from history at zero LLM cost. Use proactively in your agent loop to keep context lean.\n\n```typescript\n// Clear all tool results and thinking blocks\nhistory = janitor.compact(history, { clear: ['tool-result', 'thinking'] });\n\n// Keep the 5 most recent tool results, clear the rest (min: 1)\nhistory = janitor.compact(history, {\n  clear: [{ target: 'tool-result', keepRecent: 5 }],\n});\n\n// Combine: clear old tool results + all thinking\nhistory = janitor.compact(history, {\n  clear: [{ target: 'tool-result', keepRecent: 5 }, 'thinking'],\n});\n```\n\n#### `ensureValidHistory(history)`\n\nStandalone utility that sanitizes message history to satisfy LLM API invariants (orphan tool result removal, missing tool result placeholder injection, first-non-system-must-be-user). Use when loading history from a database or after manual modifications.\n\n```typescript\nimport { ensureValidHistory } from '@context-chef/core';\n\nconst safeHistory = ensureValidHistory(rawHistory);\nchef.setHistory(safeHistory);\n```\n\n\u003e **Boundary contract.** All input adapters (`fromOpenAI` / `fromAnthropic` / `fromGemini`, plus middleware-internal `fromAISDK` / `fromTanStackAI`) run their output through `ensureValidHistory` automatically — they're the system boundary between external SDK formats and ContextChef IR. `chef.setHistory(IR)` does NOT sanitize: IR is treated as an internal protocol, and history you construct (or mutate) directly is trusted to satisfy the invariants. 
Wrap with `ensureValidHistory(...)` explicitly when in doubt.\n\n#### `chef.clearHistory(): this`\n\nExplicitly clear history and reset Janitor state when switching topics or completing sub-tasks.\n\n---\n\n### Large Output Offloading (Offloader / VFS)\n\n```typescript\n// Offload if content exceeds threshold; preserves last 2000 chars by default\nconst safeLog = chef.offload(rawTerminalOutput);\nhistory.push({ role: \"tool\", content: safeLog, tool_call_id: \"call_123\" });\n// safeLog: original content if small, or truncated with context://vfs/ URI\n\n// Preserve head (first 500 chars) + tail (last 1000 chars), snapped to line boundaries\nconst safeOutput = chef.offload(content, { headChars: 500, tailChars: 1000 });\n\n// No preview content — just truncation notice + URI\nconst safeDoc = chef.offload(largeFileContent, { headChars: 0, tailChars: 0 });\n\n// Override threshold per call\nconst safeOutput2 = chef.offload(content, { threshold: 2000, tailChars: 500 });\n```\n\nRegister a tool for the LLM to read full content when needed:\n\n```typescript\n// In your tool handler:\nimport { Offloader } from \"@context-chef/core\";\nconst offloader = new Offloader({ storageDir: \".context_vfs\" });\nconst fullContent = offloader.resolve(uri);\n```\n\n#### Cleanup \u0026 Lifecycle\n\n`.context_vfs/` grows unboundedly without intervention. 
Configure caps and trigger cleanup yourself; it never runs automatically.\n\n```typescript\nconst chef = new ContextChef({\n  vfs: {\n    threshold: 5000,\n    maxAge: 24 * 60 * 60 * 1000, // ms since createdAt\n    maxFiles: 200,                // LRU evict by accessedAt\n    maxBytes: 50 * 1024 * 1024,   // true UTF-8 size (Buffer.byteLength)\n    onVFSEvicted: (entry, reason) =\u003e {\n      // 'maxAge' | 'maxFiles' | 'maxBytes' — errors logged and swallowed\n      logger.debug(\"evicted\", entry.uri, reason);\n    },\n  },\n});\n\n// Manual sweep — call from your agent loop, on session end, or wire to compile:done.\nconst result = await chef.getOffloader().cleanupAsync();\n// { evicted, evictedBytes, evictedByAge, evictedByCount, evictedByBytes, failed }\n\n// Override caps for one call (Infinity disables a single cap).\nawait chef.getOffloader().cleanupAsync({ maxFiles: 0 }); // maxFiles: 0 evicts every entry, not only those past maxAge\n```\n\nAfter a process restart, `reconcile()` walks the adapter and adopts orphan files into the in-memory index so subsequent `cleanup()` can see them:\n\n```typescript\nconst adopted = await chef.getOffloader().reconcileAsync({ measureBytes: true });\n// createdAt parsed from filename (vfs_\u003cts\u003e_\u003chash\u003e.txt); bytes measured if requested.\n```\n\nCleanup is **mechanism, not policy** — it is never triggered by `compile()`. Wire it to `compile:done` for per-turn enforcement, or call it on a timer / on session end. 
Custom `VFSStorageAdapter` implementations must add optional `list()` / `delete()` methods to enable cleanup; if either is missing, `cleanup()` throws `VFSCleanupNotSupportedError` (the built-in `FileSystemAdapter` implements both).\n\n\u003e **Production patterns** — see [`docs/vfs-lifecycle-recipes.md`](./docs/vfs-lifecycle-recipes.md) for runnable recipes covering long-running servers, serverless cold-start `reconcile()`, AI SDK middleware integration, custom storage adapters (Redis example), and choosing your eviction strategy.\n\n---\n\n### Tool Management (Pruner)\n\n#### Flat Mode\n\n```typescript\nchef.registerTools([\n  { name: \"read_file\", description: \"Read a file\", tags: [\"file\", \"read\"] },\n  { name: \"run_bash\", description: \"Run a command\", tags: [\"shell\"] },\n  {\n    name: \"get_time\",\n    description: \"Get timestamp\" /* no tags = always kept */,\n  },\n]);\n\nconst { tools, removed } = chef.getPruner().pruneByTask(\"Read the auth.ts file\");\n// tools: [read_file, get_time]\n```\n\nAlso supports `allowOnly(names)` and `pruneByTaskAndAllowlist(task, names)`.\n\n#### Runtime Blocklist (Permission Gate)\n\nBlock specific tools at dispatch time without breaking KV cache. Useful for permission control, environment safety, sandboxing, rate limits, and feature flags. The compiled `tools` array stays unchanged — enforcement happens via `checkToolCall` in your agent loop.\n\n```typescript\n// Set policy (rare event — startup, on user role change, prod env, etc.)\nchef.getPruner().setBlockedTools([\"delete_file\", \"tail_logs\"]);\n\n// In your agent loop, gate every tool call before dispatch:\nfor (const call of response.tool_calls) {\n  const check = chef.checkToolCall(call);\n  if (!check.allowed) {\n    history.push({\n      role: \"tool\",\n      tool_call_id: call.id,\n      content: check.reason, // e.g. 
'Tool \"delete_file\" is currently blocked.'\n    });\n    continue;\n  }\n  await executeTool(call);\n}\n```\n\n`checkToolCall` returns a discriminated union (`ToolCallCheckResult`), so TypeScript guarantees `reason` is present iff the call is rejected. KV cache is preserved across blocklist changes — the LLM continues to see the full tool set; the gate is dispatch-side only.\n\n#### Namespace + Lazy Loading (Two-Layer Architecture)\n\n**Layer 1 — Namespaces**: Core tools grouped into stable tool definitions. The tool list never changes across turns.\n\n**Layer 2 — Lazy Loading**: Long-tail tools registered as a lightweight XML directory. The LLM loads full schemas on demand via `load_toolkit`.\n\n```typescript\n// Layer 1: Stable namespace tools\nchef.registerNamespaces([\n  {\n    name: \"file_ops\",\n    description: \"File system operations\",\n    tools: [\n      {\n        name: \"read_file\",\n        description: \"Read a file\",\n        parameters: { path: { type: \"string\" } },\n      },\n      {\n        name: \"write_file\",\n        description: \"Write to a file\",\n        parameters: { path: { type: \"string\" }, content: { type: \"string\" } },\n      },\n    ],\n  },\n  {\n    name: \"terminal\",\n    description: \"Shell command execution\",\n    tools: [\n      {\n        name: \"run_bash\",\n        description: \"Execute a command\",\n        parameters: { command: { type: \"string\" } },\n      },\n    ],\n  },\n]);\n\n// Layer 2: On-demand toolkits\nchef.registerToolkits([\n  {\n    name: \"Weather\",\n    description: \"Weather forecast APIs\",\n    tools: [\n      /* ... */\n    ],\n  },\n  {\n    name: \"Database\",\n    description: \"SQL query and schema inspection\",\n    tools: [\n      /* ... 
*/\n    ],\n  },\n]);\n\n// Compile — tools: [file_ops, terminal, load_toolkit] (always stable)\nconst { tools, directoryXml } = chef.getPruner().compile();\n// directoryXml: inject into system prompt so LLM knows available toolkits\n```\n\n**Agent Loop integration:**\n\n```typescript\nfor (const toolCall of response.tool_calls) {\n  if (chef.getPruner().isNamespaceCall(toolCall)) {\n    // Route namespace call to real tool\n    const { toolName, args } = chef.getPruner().resolveNamespace(toolCall);\n    const result = await executeTool(toolName, args);\n  } else if (chef.getPruner().isToolkitLoader(toolCall)) {\n    // LLM requested a toolkit — expand and re-call\n    const parsed = JSON.parse(toolCall.function.arguments);\n    const newTools = chef.getPruner().extractToolkit(parsed.toolkit_name);\n    // Merge newTools into the next LLM request\n  }\n}\n```\n\n---\n\n### Memory\n\nPersistent key-value memory that survives across sessions. Memory is modified via tool calls (`create_memory` / `modify_memory`), which are auto-injected into the payload on `compile()`.\n\n```typescript\nimport { InMemoryStore, VFSMemoryStore } from \"@context-chef/core\";\n\nconst chef = new ContextChef({\n  memory: {\n    store: new InMemoryStore(), // ephemeral (testing)\n    // store: new VFSMemoryStore(dir),   // persistent (production)\n  },\n});\n\n// In your agent loop, intercept memory tool calls:\nfor (const toolCall of response.tool_calls) {\n  if (toolCall.function.name === \"create_memory\") {\n    const { key, value, description } = JSON.parse(toolCall.function.arguments);\n    await chef.getMemory().createMemory(key, value, description);\n  } else if (toolCall.function.name === \"modify_memory\") {\n    const { action, key, value, description } = JSON.parse(toolCall.function.arguments);\n    if (action === \"update\") {\n      await chef.getMemory().updateMemory(key, value, description);\n    } else {\n      await chef.getMemory().deleteMemory(key);\n    }\n  }\n}\n\n// 
Direct read/write (developer use, bypasses validation hooks)\nawait chef.getMemory().set(\"persona\", \"You are a senior engineer\", {\n  description: \"The agent's persona and role\",\n});\nconst value = await chef.getMemory().get(\"persona\");\n\n// On compile():\n// - Memory tools (create_memory, modify_memory) are auto-injected into payload.tools\n// - Existing memories are injected as \u003cmemory\u003e XML between systemPrompt and history\n```\n\n#### Memory placement — `memoryPlacement`\n\nControls where the volatile `\u003cmemory\u003e` data block lands in the compiled payload. Defaults to `'after_system'` (backward compatible). For applications using **Anthropic prompt caching** with cache breakpoints on history, switch to `'before_history_tail'` so memory mutations don't invalidate the history cache.\n\n```typescript\nconst chef = new ContextChef({\n  memory: {\n    store: new VFSMemoryStore(dir),\n    memoryPlacement: 'before_history_tail',\n  },\n});\n```\n\n| Placement | Top of sandwich | Last user message | When to use |\n|---|---|---|---|\n| `'after_system'` (default) | INSTRUCTION + `\u003cmemory\u003e` data, combined into one `role: 'system'` message | untouched | Simple agents; you don't rely on cache breakpoints past the system parameter |\n| `'before_history_tail'` | INSTRUCTION only (stable, cacheable) | appends the `\u003cmemory\u003e` data block to the original user content | You want cache breakpoints on history (or earlier `system` blocks) to survive memory mutations on every turn |\n\nThe split keeps the stable usage instruction at the top of the sandwich where it caches cleanly, and ships the volatile data block at the tail of the conversation. 
Anthropic / Gemini adapters extract every `role: 'system'` message into the top-level `system` parameter — under `'before_history_tail'` the data block stays in `messages` instead, so any cache breakpoint earlier in the message stream no longer hashes the changing memory text.\n\nWhen dynamic state is also injected at the tail (`dynamicStatePlacement: 'last_user'`), the order inside the last user message is: original content → `\u003cmemory\u003e` → `\u003cdynamic_state\u003e` → `\u003cimplicit_context\u003e` → anchor line. When dynamic state goes to its own system message (`dynamicStatePlacement: 'system'`), memory still injects at the user tail with no anchor.\n\n---\n\n### Skill (Behavior Bundle)\n\nA `Skill` is a portable bundle of `(name + description + instructions + ...)` that scopes the agent's behavior for a specific phase or domain. Activating a skill injects its instructions as a dedicated system message between your system prompt and the memory block — no prompt rewriting on your side. 
Skills can be inline JS objects or loaded from `SKILL.md` files (same frontmatter shape as Claude Code / Mastra / OpenCode).\n\n```typescript\nimport { ContextChef, type Skill } from \"@context-chef/core\";\n\nconst planning: Skill = {\n  name: \"planning\",\n  description: \"Plan changes before editing\",\n  whenToUse: \"When the task is non-trivial and requires multiple steps\",\n  instructions: \"Read code, list affected files, write plan to scratchpad.\",\n  allowedTools: [\"read_file\", \"grep\"], // annotation only — chef does NOT enforce\n};\n\nconst chef = new ContextChef();\nchef.registerSkills([planning]);\nchef.activateSkill(\"planning\");\n// activateSkill also accepts a Skill object directly, or null to deactivate.\n\nconst { messages, meta } = await chef.compile({ target: \"openai\" });\n// messages = [...systemPrompt, { role: 'system', content: planning.instructions }, ...rest]\n// meta.activeSkillName === 'planning'\n```\n\n#### Loading from `SKILL.md`\n\n```typescript\nimport {\n  loadSkill,\n  loadSkillsDir,\n  formatSkillListing,\n} from \"@context-chef/core\";\n\n// Load a single skill file\nconst skill = await loadSkill(\"./skills/db-debug/SKILL.md\");\n\n// Or scan a directory: each subdir/SKILL.md becomes a Skill (tolerant — bad files surface in `errors`)\nconst { skills, errors } = await loadSkillsDir(\"./skills\");\nchef.registerSkills(skills);\n\n// Render a system-prompt-friendly listing (useful for LLM-driven `load_skill` tool)\nconst listing = formatSkillListing(skills, { format: \"plain\" });\n```\n\nThe listing is typically used as the description of a `load_skill` tool, letting the LLM pick a skill itself:\n\n```typescript\nconst loadSkillTool = {\n  name: \"load_skill\",\n  description:\n    \"Load a skill to specialize for the current task. 
Available:\\n\" + listing,\n  parameters: {\n    skill_name: {\n      type: \"string\",\n      enum: chef.getRegisteredSkills().map((s) =\u003e s.name),\n    },\n  },\n};\n\n// In your dispatch loop:\nif (call.name === \"load_skill\") {\n  chef.activateSkill(call.args.skill_name);\n  /* push tool result, continue loop */\n}\n```\n\nFor the design rationale (Skill ⊥ Pruner decoupling, SKILL.md frontmatter shape, mode-wiring recipes, LLM-driven skill loading, reference files) see [`SKILL_SPEC.md`](./SKILL_SPEC.md).\n\n---\n\n### Snapshot \u0026 Restore\n\nCapture and rollback full context state for branching or error recovery.\n\n```typescript\nconst snap = chef.snapshot(\"before risky tool call\");\n\n// ... agent executes tool, something goes wrong ...\n\nchef.restore(snap); // rolls back everything: history, dynamic state, janitor state, memory\n```\n\n---\n\n### Lifecycle Events\n\nUnified event system for observability across all internal modules. Subscribe via `chef.on()`, unsubscribe via `chef.off()`.\n\n```typescript\n// Log when history gets compressed\nchef.on('compress', ({ summary, truncatedCount }) =\u003e {\n  console.log(`Compressed ${truncatedCount} messages`);\n});\n\n// Track compile metrics\nchef.on('compile:done', ({ payload }) =\u003e {\n  metrics.track('compile', { messageCount: payload.messages.length });\n});\n\n// Monitor memory changes\nchef.on('memory:changed', ({ type, key, value }) =\u003e {\n  console.log(`Memory ${type}: ${key}`);\n});\n```\n\n#### Available Events\n\n| Event | Payload | Description |\n|---|---|---|\n| `compile:start` | `{ systemPrompt, history }` | Emitted at the start of `compile()` |\n| `compile:done` | `{ payload }` | Emitted after `compile()` produces the final payload |\n| `compress` | `{ summary, truncatedCount }` | Emitted after Janitor compresses history |\n| `memory:changed` | `{ type, key, value, oldValue }` | Emitted after any memory mutation (set, delete, expire) |\n| `memory:expired` | `MemoryEntry` | 
Emitted when a memory entry expires during `compile()` |\n\nEvents are **observation-only** — they don't affect control flow. Intercept hooks (`onBeforeCompress`, `onMemoryUpdate`, `onBeforeCompile`, `transformContext`) remain as config callbacks.\n\nEvents coexist with existing config callbacks: if you provide `onCompress` in `JanitorConfig`, it fires first, then the `compress` event is emitted.\n\n#### Cancellation — `compile({ signal })`\n\nPass an `AbortSignal` to `compile()` to cancel an in-flight compile and propagate the signal to all event handlers fired during that call.\n\n```typescript\nconst controller = new AbortController();\nsetTimeout(() =\u003e controller.abort(), 5000); // hard 5s budget\n\nchef.on('compile:done', async ({ payload }, signal) =\u003e {\n  // signal === controller.signal — forward it to slow async work\n  await db.write(payload, { signal });\n  await metrics.report(payload, { signal });\n});\n\ntry {\n  await chef.compile({ target: 'openai', signal: controller.signal });\n} catch (err) {\n  if (err instanceof DOMException \u0026\u0026 err.name === 'AbortError') {\n    // compile was cancelled mid-flight (Janitor / onBeforeCompile / transformContext boundary)\n  }\n  throw err;\n}\n```\n\nTwo effects:\n\n1. **Forwarded to handlers** — `chef.on(event, (payload, signal?) =\u003e ...)` receives the signal as the second argument. Handler can pass it to `fetch`, DB clients, or any cooperative API.\n2. **Checked at compile() boundaries** — after Janitor compress, after `onBeforeCompile`, after `transformContext`. Aborts throw via `signal.throwIfAborted()`.\n\n`compile:start` fires before the first abort check, so observers may receive a `compile:start` for a compile that ultimately throws without firing `compile:done`. 
Memory events fired from external `getMemory().set()` / `delete()` calls (outside `compile()`) get `signal: undefined`.\n\n#### Concurrency Model\n\n**Canonical pattern: one `ContextChef` instance per concurrent caller.** A chef holds mutable state across `await` points (in-flight signal, memory turn counter, active skill, history reference). Per-request instantiation gives each call its own state: with no shared mutable state, there is nothing to race on.\n\n```typescript\n// Express / Fastify / Hono: one chef per request\napp.post('/agent', async (req, res) =\u003e {\n  const chef = new ContextChef({ memory: { store: sharedMemoryStore } });\n  chef.setHistory(req.body.history);\n  const payload = await chef.compile({ target: 'openai' });\n  res.json(payload);\n});\n```\n\nIf memory needs to span requests, lift the store out (`VFSMemoryStore`, your own Redis-backed store) and pass it to per-request chefs. Store-level concurrency is the store's responsibility, not the chef's.\n\n**A chef is single-threaded by design; do not share one across concurrent `compile()` calls.** Two overlapping `compile()` calls on the same instance clobber each other's `_currentSignal`, double-advance the memory turn counter, and interleave skill/history reads. Serialize calls per instance (await each `chef.compile()` before starting the next), or use the per-request pattern above.
A snapshot+serialize defensive option is in the roadmap (TODO T2.4.1, low priority) but is not needed for canonical usage.\n\n---\n\n### `onBeforeCompile` Hook\n\nInject external context (RAG, AST snippets, MCP queries) right before compilation without modifying the message array.\n\n```typescript\nconst chef = new ContextChef({\n  onBeforeCompile: async (ctx) =\u003e {\n    const snippets = await vectorDB.search(ctx.dynamicStateXml);\n    return snippets.map((s) =\u003e s.content).join(\"\\n\");\n    // Injected as \u003cimplicit_context\u003e...\u003c/implicit_context\u003e alongside dynamic state\n    // Return null to skip injection\n  },\n});\n```\n\n---\n\n### Input Adapters (Provider → IR)\n\nConvert OpenAI / Anthropic / Gemini native messages to ContextChef IR, automatically separating system and history. Each adapter sanitizes the result via `ensureValidHistory` at the boundary — orphan tool results are dropped, missing tool results get an `[No tool result available]` placeholder, and the first non-system message is forced to be a user message. 
IR you build manually with `chef.setHistory(...)` is NOT sanitized; trust the IR or call `ensureValidHistory(messages)` yourself.\n\n```typescript\nimport { fromOpenAI, fromAnthropic, fromGemini } from \"@context-chef/core\";\n\n// OpenAI\nconst { system, history } = fromOpenAI(openaiMessages);\nchef.setSystemPrompt(system).setHistory(history);\n\n// Anthropic (system is a separate top-level parameter)\nconst { system, history } = fromAnthropic(anthropicMessages, anthropicSystem);\nchef.setSystemPrompt(system).setHistory(history);\n\n// Gemini (systemInstruction is a separate top-level parameter)\nconst { system, history } = fromGemini(geminiContents, systemInstruction);\nchef.setSystemPrompt(system).setHistory(history);\n```\n\nMultimodal content (images, files) is automatically converted to IR `attachments`:\n\n| Provider Format | IR Field |\n|---|---|\n| OpenAI `image_url` / `file` | `attachments: [{ mediaType, data }]` |\n| Anthropic `image` / `document` | `attachments: [{ mediaType, data }]` |\n| Gemini `inlineData` / `fileData` | `attachments: [{ mediaType, data }]` |\n\n`compile()` converts `attachments` back to the corresponding provider format. 
During compression, Janitor guides the compression model to describe image content.\n\n---\n\n### Target Adapters\n\n| Feature                      | OpenAI                      | Anthropic                              | Gemini                                     |\n| ---------------------------- | --------------------------- | -------------------------------------- | ------------------------------------------ |\n| Format                       | Chat Completions            | Messages API                           | generateContent                            |\n| Cache breakpoints            | Stripped                    | `cache_control: { type: 'ephemeral' }` | Stripped (uses separate CachedContent API) |\n| Prefill (trailing assistant) | Degraded to `[System Note]` | Native support                         | Degraded to `[System Note]`                |\n| `thinking` field             | Stripped                    | Mapped to `ThinkingBlockParam`         | Stripped                                   |\n| Tool calls                   | `tool_calls` array          | `tool_use` blocks                      | `functionCall` parts                       |\n| `attachments`                | `image_url` / `file` content parts | `image` / `document` blocks   | `inlineData` / `fileData` parts            |\n\nAdapters are selected automatically by `compile({ target })`. You can also use them standalone:\n\n```typescript\nimport { getAdapter } from \"@context-chef/core\";\nconst adapter = getAdapter(\"gemini\");\nconst payload = adapter.compile(messages);\n```\n\n#### Custom adapters — `adapterRegistry` and `defaultTarget`\n\nThe three built-ins (`'openai' | 'anthropic' | 'gemini'`) are registered automatically. 
To plug in a third-party provider (Cohere, Mistral, an in-house protocol), implement `ITargetAdapter` and register it once:\n\n```typescript\nimport { adapterRegistry, ITargetAdapter } from \"@context-chef/core\";\n\nclass CohereAdapter implements ITargetAdapter {\n  compile(messages) {\n    /* return Cohere-shaped payload */\n  }\n}\n\nadapterRegistry.register(\"cohere\", new CohereAdapter());\nawait chef.compile({ target: \"cohere\" }); // routed via the registry\n```\n\n`compile({ target })` accepts three forms:\n\n| Form                  | Example                                | Use case                                      |\n| --------------------- | -------------------------------------- | --------------------------------------------- |\n| Built-in literal      | `compile({ target: \"openai\" })`        | Strict payload type via the type overloads    |\n| Registered name       | `compile({ target: \"cohere\" })`        | Reuse the same custom adapter many times      |\n| `ITargetAdapter`      | `compile({ target: new MyAdapter() })` | One-off use / tests — bypasses the registry   |\n\nSet `defaultTarget` once in the constructor to avoid repeating it on every call:\n\n```typescript\nconst chef = new ContextChef({ defaultTarget: \"anthropic\" });\nawait chef.compile(); // → AnthropicPayload\n```\n\nResolution order in `compile()`:\n`options.target` → `ChefConfig.defaultTarget` → `'openai'` (final built-in fallback).\n\nFor plugin systems and test isolation, pass a `sourceId` so a batch of registrations can be torn down together:\n\n```typescript\nadapterRegistry.register(\"cohere\", new CohereAdapter(), \"my-plugin\");\nadapterRegistry.register(\"mistral\", new MistralAdapter(), \"my-plugin\");\n// Later — unload the entire plugin in one call\nadapterRegistry.unregisterBySource(\"my-plugin\");\n```\n\n\u003e **Replacing a built-in name** (e.g. 
`register('openai', myFork)`) keeps the strict overload's payload return type — `compile({ target: 'openai' })` is still typed `Promise\u003cOpenAIPayload\u003e`, so your replacement must honor that shape at runtime. TypeScript can't enforce this for you.\n\n---\n\n## Skills\n\nContextChef provides [Claude Code Skills](https://docs.anthropic.com/en/docs/claude-code/skills) that help you integrate the library into your project interactively. Each skill analyzes your existing codebase and generates tailored integration code.\n\n| Skill | Description |\n|---|---|\n| `context-chef-core` | Integrate `@context-chef/core` — full control over compilation pipeline, multi-provider support |\n| `context-chef-middleware` | Integrate `@context-chef/ai-sdk-middleware` — drop-in AI SDK middleware, zero code changes |\n| `context-chef-tanstack` | Integrate `@context-chef/tanstack-ai` — TanStack AI ChatMiddleware with compression and state injection |\n\n### Install\n\nInstall only what you need:\n\n```bash\n# Core library (OpenAI / Anthropic / Gemini direct SDK usage)\nnpx skills add MyPrototypeWhat/context-chef --skill context-chef-core\n\n# AI SDK middleware (Vercel AI SDK v6+)\nnpx skills add MyPrototypeWhat/context-chef --skill context-chef-middleware\n\n# TanStack AI middleware (TanStack AI v0.10+)\nnpx skills add MyPrototypeWhat/context-chef --skill context-chef-tanstack\n\n# All\nnpx skills add MyPrototypeWhat/context-chef\n```\n\n### Use\n\nOpen [Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview) in your project and type:\n\n```\n/context-chef-core\n# or\n/context-chef-middleware\n```\n\nClaude will:\n\n1. **Detect your setup** — LLM SDK, package manager, TypeScript vs JavaScript\n2. **Ask about your needs** — history compression, tool management, truncation, memory, etc.\n3. **Generate integration code** — tailored to your project structure and existing agent loop\n4. 
**Explain the architecture** — processing pipeline, cache breakpoints, dynamic state placement\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmyprototypewhat%2Fcontext-chef","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmyprototypewhat%2Fcontext-chef","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmyprototypewhat%2Fcontext-chef/lists"}