https://github.com/mcp-tool-shop-org/context-window-manager

MCP server for lossless LLM context restoration via KV cache persistence
https://github.com/mcp-tool-shop-org/context-window-manager

ai claude context-management context-window kv-cache llm llm-inference lmcache machine-learning mcp mcp-server memory model-context-protocol python rag session-management token-management vllm

Last synced: 13 days ago
JSON representation

MCP server for lossless LLM context restoration via KV cache persistence

Host: GitHub
URL: https://github.com/mcp-tool-shop-org/context-window-manager
Owner: mcp-tool-shop-org
License: mit
Created: 2026-01-23T14:21:05.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2026-02-22T15:36:15.000Z (17 days ago)
Last Synced: 2026-02-22T20:38:08.854Z (17 days ago)
Topics: ai, claude, context-management, context-window, kv-cache, llm, llm-inference, lmcache, machine-learning, mcp, mcp-server, memory, model-context-protocol, python, rag, session-management, token-management, vllm
Language: Python
Size: 244 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 6
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Security: docs/SECURITY.md
- Roadmap: docs/ROADMAP.md

Awesome Lists containing this project

README

          


  日本語 | 中文 | Español | Français | हिन्दी | Italiano | Português (BR)





  





  

  

  

  



> **Lossless context restoration for LLM sessions via KV cache persistence**

---

## What is this?

Context Window Manager (CWM) is an MCP server that solves the **context exhaustion problem** in LLM applications. Instead of losing your conversation history when context fills up, CWM lets you:

- **Freeze** your current context to persistent storage

- **Thaw** it back later with zero information loss

- **Clone** contexts to explore different conversation branches

- **Resume** exactly where you left off

Unlike summarization or RAG approaches, CWM preserves the actual KV cache tensors, giving you **true, lossless restoration**.

---

## How it works

```

Traditional Approach (Lossy):

┌─────────────────────────────────────────────┐

│ Context fills up → Summarize → Lose details │

└─────────────────────────────────────────────┘

CWM Approach (Lossless):

┌──────────────────────────────────────────────────────────────┐

│ Context fills up → Freeze KV cache → Store tensors → Thaw   │

│                                                    ↓        │

│                              Exact restoration, zero loss   │

└──────────────────────────────────────────────────────────────┘

```

CWM leverages:

- **vLLM's prefix caching** with `cache_salt` for session isolation

- **LMCache** for tiered KV cache storage (GPU → CPU → Disk → Redis)

- **MCP protocol** for seamless integration with Claude Code and other MCP clients

---

## Quick Start

### Prerequisites

- Python 3.11+

- vLLM server with prefix caching enabled

- LMCache configured with vLLM

### Installation

```bash

pip install cwm-mcp

```

### Configuration

Add to your Claude Code settings (`.claude/settings.json`):

```json

{

  "mcpServers": {

    "context-window-manager": {

      "command": "python",

      "args": ["-m", "context_window_manager"],

      "env": {

        "CWM_VLLM_URL": "http://localhost:8000"

      }

    }

  }

}

```

### Usage

```

# Freeze your current session

> window_freeze session_abc123 my-coding-project

# Later, restore it

> window_thaw my-coding-project

# List all saved windows

> window_list

# Check status

> window_status my-coding-project

```

---

## Features

### Core Operations

| Tool | Description |

|------|-------------|

| `window_freeze` | Snapshot session context to storage |

| `window_thaw` | Restore context from a saved window |

| `window_list` | List available context windows |

| `window_status` | Get detailed session/window info |

| `window_clone` | Branch a context for exploration |

| `window_delete` | Remove a saved window |

### Storage Tiers

CWM automatically manages storage across tiers:

1. **CPU Memory** - Fast, limited capacity

2. **Disk** - Large capacity, compressed

3. **Redis** - Distributed, shared across instances

### Session Isolation

Each session gets a unique `cache_salt`, ensuring:

- No cross-session data leakage

- Protection against timing attacks

- Clean separation of contexts

---

## Documentation

| Document | Description |

|----------|-------------|

| [USER_GUIDE.md](docs/USER_GUIDE.md) | Getting started and workflows |

| [API.md](docs/API.md) | Complete API reference |

| [ARCHITECTURE.md](docs/ARCHITECTURE.md) | Technical architecture deep-dive |

| [SECURITY.md](docs/SECURITY.md) | Security considerations |

| [ERROR_HANDLING.md](docs/ERROR_HANDLING.md) | Error taxonomy and handling |

| [ROADMAP.md](docs/ROADMAP.md) | Development phases and milestones |

| [CONTRIBUTING.md](docs/CONTRIBUTING.md) | Development guidelines |

---

## Requirements

### vLLM Server Configuration

```bash

vllm serve "meta-llama/Llama-3.1-8B-Instruct" \

  --enable-prefix-caching \

  --kv-transfer-config '{"kv_connector":"LMCacheConnectorV1","kv_role":"kv_both"}'

```

### LMCache Environment

```bash

export LMCACHE_USE_EXPERIMENTAL=True

export LMCACHE_LOCAL_CPU=True

export LMCACHE_MAX_LOCAL_CPU_SIZE=8.0

```

---

## Development

```bash

# Clone and setup

git clone https://github.com/mcp-tool-shop-org/context-window-manager.git

cd context-window-manager

python -m venv .venv

.venv\Scripts\activate  # Windows

pip install -e ".[dev]"

# Run tests

pytest tests/unit/

# Run with coverage

pytest tests/unit/ --cov=src/context_window_manager

```

See [CONTRIBUTING.md](docs/CONTRIBUTING.md) for detailed guidelines.

---

## Roadmap

- [x] Phase 0: Documentation & Architecture

- [x] Phase 1: Core Infrastructure

- [x] Phase 2: MCP Server Shell

- [x] Phase 3: Freeze Implementation

- [x] Phase 4: Thaw Implementation

- [x] Phase 5: Advanced Features (clone, auto-freeze)

- [x] Phase 6: Production Hardening

- [x] Phase 7: Integration & Polish

See [ROADMAP.md](docs/ROADMAP.md) for details.

---

## License

MIT License - see [LICENSE](LICENSE) for details.

---

## Acknowledgments

- [vLLM](https://github.com/vllm-project/vllm) - High-throughput LLM serving

- [LMCache](https://github.com/LMCache/LMCache) - KV cache persistence layer

- [Model Context Protocol](https://modelcontextprotocol.io/) - Integration standard

- [Recursive Language Models](https://arxiv.org/abs/2512.24601) - Inspiration for context management

---

## Status

**Beta (v0.6.4)** - Production hardening complete. CI consolidated (2 workflows). 366 tests passing.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mcp-tool-shop-org/context-window-manager

Awesome Lists containing this project

README