# Context Optimizer MCP

An MCP (Model Context Protocol) server that uses Redis and in-memory caching to optimize and extend context windows for large chat histories.

## Features

- **Dual-Layer Caching**: Combines fast in-memory LRU cache with persistent Redis storage
- **Smart Context Management**: Automatically summarizes older messages to maintain context within token limits
- **Rate Limiting**: Redis-based rate limiting with burst protection (see the sketch after this list)
- **API Compatibility**: Drop-in replacement for Anthropic API with enhanced context handling
- **Metrics Collection**: Built-in performance monitoring and logging
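
A common way to implement Redis-based rate limiting is a fixed window with a small burst allowance. The sketch below assumes the `ioredis` package; the key format, limits, and window size are illustrative, not the server's actual values.

```javascript
import Redis from 'ioredis';

const redis = new Redis();

// Allow `limit` requests per window, plus a small burst above it.
async function allowRequest(clientId, { limit = 60, burst = 10, windowSec = 60 } = {}) {
  const key = `ratelimit:${clientId}`;
  const count = await redis.incr(key); // atomic per-client counter
  if (count === 1) {
    await redis.expire(key, windowSec); // first hit starts the window
  }
  return count <= limit + burst; // reject once the burst headroom is spent
}
```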

## How It Works

This MCP server acts as middleware between your application and LLM providers (currently supporting Anthropic's Claude models). It manages conversation context through three strategies:

1. **Context Window Optimization**: When conversations approach the model's token limit, older messages are automatically summarized while preserving key information.

2. **Efficient Caching** (sketched after this list):
   - In-memory LRU cache for frequently accessed conversation summaries
   - Redis for persistent, distributed storage of conversation history and summaries

3. **Transparent Processing**: The server handles all context management automatically while maintaining compatibility with the standard API.
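
A rough sketch of the dual-layer lookup from point 2, assuming the `lru-cache` and `ioredis` packages (the helper names and key format are illustrative):

```javascript
import { LRUCache } from 'lru-cache';
import Redis from 'ioredis';

const memory = new LRUCache({ max: 1000 }); // hot, per-process layer
const redis = new Redis(); // persistent, shared layer

async function getSummary(conversationId) {
  const key = `summary:${conversationId}`;

  const hit = memory.get(key);
  if (hit !== undefined) return hit; // fastest path

  const stored = await redis.get(key);
  if (stored !== null) {
    memory.set(key, stored); // backfill the hot cache
    return stored;
  }
  return null; // no summary yet
}

async function setSummary(conversationId, summary, ttlSec = 86400) {
  const key = `summary:${conversationId}`;
  memory.set(key, summary);
  await redis.set(key, summary, 'EX', ttlSec); // persist with a TTL
}
```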

## Getting Started

### Prerequisites

- Node.js 18+
- Redis server (local or remote)
- Anthropic API key

### Installation Options

#### 1. Using MCP client

The easiest way to install and run this server is using the MCP client:

```bash
# Install via npx
npx mcp install degenhero/context-optimizer-mcp

# Or using uvx
uvx mcp install degenhero/context-optimizer-mcp
```

Make sure to set your Anthropic API key when prompted during installation.

#### 2. Manual Installation

```bash
# Clone the repository
git clone https://github.com/degenhero/context-optimizer-mcp.git
cd context-optimizer-mcp

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env
# Edit .env with your configuration

# Start the server
npm start
```

#### 3. Using Docker

```bash
# Clone the repository
git clone https://github.com/degenhero/context-optimizer-mcp.git
cd context-optimizer-mcp

# Build and start with Docker Compose
docker-compose up -d
```

This will start both the MCP server and a Redis instance.
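
The repository ships its own compose file; purely for orientation, a minimal equivalent might look like the following (service names, ports, and image tags are assumptions, not the actual file):

```yaml
services:
  server:
    build: .
    ports:
      - "3000:3000"
    environment:
      - REDIS_HOST=redis # point the server at the bundled Redis
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
```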

### Configuration

Configure the server by editing the `.env` file:

```
# Server configuration
PORT=3000

# Anthropic API key
ANTHROPIC_API_KEY=your_anthropic_api_key

# Redis configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=

# Caching settings
IN_MEMORY_CACHE_MAX_SIZE=1000
REDIS_CACHE_TTL=86400 # 24 hours in seconds

# Model settings
DEFAULT_MODEL=claude-3-opus-20240229
DEFAULT_MAX_TOKENS=4096
```

## API Usage

The server exposes an endpoint compatible with the standard Claude Messages API, with additional context optimization features:

```javascript
// Example client usage
const response = await fetch('http://localhost:3000/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-3-opus-20240229',
    messages: [
      { role: 'user', content: 'Hello!' },
      { role: 'assistant', content: 'How can I help you today?' },
      { role: 'user', content: 'Tell me about context management.' },
    ],
    max_tokens: 1000,
    // Optional MCP-specific parameters:
    conversation_id: 'unique-conversation-id', // For context tracking
    context_optimization: true, // Enable/disable optimization
  }),
});

const result = await response.json();
```

### Additional Endpoints

- `GET /v1/token-count?text=your_text&model=model_name`: Count tokens in a text string
- `GET /health`: Server health check
- `GET /metrics`: View server performance metrics
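
For example, counting tokens from a Node.js client (the `token_count` response field name is an assumption for illustration):

```javascript
const res = await fetch(
  'http://localhost:3000/v1/token-count?' +
    new URLSearchParams({ text: 'Hello, world!', model: 'claude-3-opus-20240229' })
);
const { token_count } = await res.json(); // field name assumed
console.log(token_count);
```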

## Testing

A test script is included to demonstrate how context optimization works:

```bash
# Run the test script
npm run test:context
```

This will start an interactive session where you can have a conversation and see how the context gets optimized as it grows.

## Advanced Features

### Context Summarization

When a conversation exceeds 80% of the model's token limit, the server automatically summarizes older messages; the resulting summary is cached for reuse.
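
In sketch form, where `countTokens` and `summarize` are hypothetical stand-ins for the server's internals (injected as arguments so the sketch stays self-contained), and the exact split point is an assumption:

```javascript
async function optimizeContext(messages, modelTokenLimit, { countTokens, summarize }) {
  const total = await countTokens(messages);
  if (total <= modelTokenLimit * 0.8) return messages; // under the 80% threshold

  const keep = messages.slice(-10); // most recent messages stay verbatim
  const older = messages.slice(0, -10); // everything else gets condensed
  const summary = await summarize(older); // result is cached for future requests

  return [
    { role: 'user', content: `Summary of earlier conversation: ${summary}` },
    ...keep,
  ];
}
```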

### Conversation Continuity

By providing a consistent `conversation_id` in requests, the server can maintain context across multiple API calls, even if individual requests would exceed token limits.
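
For example, two calls sharing one id (the id value is illustrative):

```javascript
const conversationId = 'support-session-42';

for (const content of ['First question', 'A follow-up question']) {
  const res = await fetch('http://localhost:3000/v1/messages', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'claude-3-opus-20240229',
      messages: [{ role: 'user', content }],
      max_tokens: 1000,
      conversation_id: conversationId, // same id ties the calls together
    }),
  });
  console.log(await res.json());
}
```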

## Performance Considerations

- In-memory cache provides fastest access for active conversations
- Redis enables persistence and sharing across server instances
- Summarization operations add some latency to requests that exceed token thresholds

## Documentation

Additional documentation can be found in the `docs/` directory:

- [Architecture Overview](docs/ARCHITECTURE.md)
- [Contributing Guide](docs/CONTRIBUTING.md)

## License

MIT