https://github.com/degenhero/context-optimizer-mcp
An MCP server that uses Redis and in-memory caching to optimize and extend context windows for large chat histories
- Host: GitHub
- URL: https://github.com/degenhero/context-optimizer-mcp
- Owner: degenhero
- Created: 2025-03-29T18:47:49.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-03-29T18:56:56.000Z (7 months ago)
- Last Synced: 2025-06-26T10:51:47.878Z (4 months ago)
- Language: JavaScript
- Size: 22.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: docs/CONTRIBUTING.md
Awesome Lists containing this project
- awesome-mcp-servers - **context-optimizer-mcp** - An MCP server that uses Redis and in-memory caching to optimize and extend context windows for large chat histories `javascript` `mcp` `server` `redis` `npm install degenhero/context-optimizer-mcp` (🌐 Web Development)
README
# Context Optimizer MCP
An MCP (Model Context Protocol) server that uses Redis and in-memory caching to optimize and extend context windows for large chat histories.
## Features
- **Dual-Layer Caching**: Combines fast in-memory LRU cache with persistent Redis storage
- **Smart Context Management**: Automatically summarizes older messages to maintain context within token limits
- **Rate Limiting**: Redis-based rate limiting with burst protection (see the sketch after this list)
- **API Compatibility**: Drop-in replacement for the Anthropic API with enhanced context handling
- **Metrics Collection**: Built-in performance monitoring and logging
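A minimal fixed-window sketch of that Redis-based rate limiting, using the `redis` npm client; this is an assumption about the approach, not the project's actual implementation:

```javascript
// Fixed-window rate limiter sketch (hypothetical, not this repository's code).
// Assumes the official `redis` client (v4+) and a per-client key such as an IP address.
import { createClient } from 'redis';

const redis = createClient({ url: 'redis://localhost:6379' });
await redis.connect();

async function isRateLimited(clientId, limit = 60, windowSeconds = 60) {
  const key = `ratelimit:${clientId}`;
  const count = await redis.incr(key);        // count requests in the current window
  if (count === 1) {
    await redis.expire(key, windowSeconds);   // start the window on the first request
  }
  return count > limit;                       // true once the limit is exceeded
}
```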
## How It Works
This MCP server acts as middleware between your application and LLM providers (currently supporting Anthropic's Claude models). It manages conversation context through three strategies:
1. **Context Window Optimization**: When conversations approach the model's token limit, older messages are automatically summarized while preserving key information.
2. **Efficient Caching**:
   - In-memory LRU cache for frequently accessed conversation summaries
   - Redis for persistent, distributed storage of conversation history and summaries (a lookup sketch follows this list)
3. **Transparent Processing**: The server handles all context management automatically while maintaining compatibility with the standard API.
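A rough read-through sketch of that dual-layer lookup, assuming the `lru-cache` and `redis` packages, which may differ from what the project actually uses:

```javascript
// Read-through lookup across the two cache layers (illustrative sketch only).
import { LRUCache } from 'lru-cache';
import { createClient } from 'redis';

const memory = new LRUCache({ max: 1000 });          // fast, per-process layer
const redis = createClient({ url: 'redis://localhost:6379' });
await redis.connect();

async function getSummary(conversationId) {
  const key = `summary:${conversationId}`;

  const cached = memory.get(key);                    // 1. check the in-memory LRU first
  if (cached !== undefined) return cached;

  const persisted = await redis.get(key);            // 2. fall back to Redis
  if (persisted !== null) {
    memory.set(key, persisted);                      //    warm the memory layer on a hit
    return persisted;
  }
  return null;                                       // 3. nothing cached yet
}
```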
## Getting Started
### Prerequisites
- Node.js 18+
- Redis server (local or remote)
- Anthropic API key
### Installation Options
#### 1. Using MCP client
The easiest way to install and run this server is using the MCP client:
```bash
# Install via npx
npx mcp install degenhero/context-optimizer-mcp
# Or using uvx
uvx mcp install degenhero/context-optimizer-mcp
```
Make sure to set your Anthropic API key when prompted during installation.
#### 2. Manual Installation
```bash
# Clone the repository
git clone https://github.com/degenhero/context-optimizer-mcp.git
cd context-optimizer-mcp
# Install dependencies
npm install
# Set up environment variables
cp .env.example .env
# Edit .env with your configuration
# Start the server
npm start
```
#### 3. Using Docker
```bash
# Clone the repository
git clone https://github.com/degenhero/context-optimizer-mcp.git
cd context-optimizer-mcp
# Build and start with Docker Compose
docker-compose up -d
```
This will start both the MCP server and a Redis instance.
### Configuration
Configure the server by editing the `.env` file:
```
# Server configuration
PORT=3000
# Anthropic API key
ANTHROPIC_API_KEY=your_anthropic_api_key
# Redis configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=
# Caching settings
IN_MEMORY_CACHE_MAX_SIZE=1000
REDIS_CACHE_TTL=86400 # 24 hours in seconds
# Model settings
DEFAULT_MODEL=claude-3-opus-20240229
DEFAULT_MAX_TOKENS=4096
```
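These variables are presumably read at startup; a typical Node.js pattern, assuming the `dotenv` package rather than the project's exact loading code, would be:

```javascript
// Sketch of reading the .env values above (dotenv is assumed, not confirmed by the repo).
import 'dotenv/config';

const config = {
  port: Number(process.env.PORT ?? 3000),
  anthropicApiKey: process.env.ANTHROPIC_API_KEY,
  redis: {
    host: process.env.REDIS_HOST ?? 'localhost',
    port: Number(process.env.REDIS_PORT ?? 6379),
    password: process.env.REDIS_PASSWORD || undefined,
  },
  inMemoryCacheMaxSize: Number(process.env.IN_MEMORY_CACHE_MAX_SIZE ?? 1000),
  redisCacheTtl: Number(process.env.REDIS_CACHE_TTL ?? 86400),
  defaultModel: process.env.DEFAULT_MODEL ?? 'claude-3-opus-20240229',
  defaultMaxTokens: Number(process.env.DEFAULT_MAX_TOKENS ?? 4096),
};
```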
## API Usage
The server exposes an API endpoint compatible with the standard Claude API, with additional context-optimization features:
```javascript
// Example client usage
const response = await fetch('http://localhost:3000/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-3-opus-20240229',
    messages: [
      { role: 'user', content: 'Hello!' },
      { role: 'assistant', content: 'How can I help you today?' },
      { role: 'user', content: 'Tell me about context management.' }
    ],
    max_tokens: 1000,
    // Optional MCP-specific parameters:
    conversation_id: 'unique-conversation-id', // For context tracking
    context_optimization: true, // Enable/disable optimization
  }),
});

const result = await response.json();
```
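Since the server is described as a drop-in replacement for the Anthropic API, the JSON response should mirror the upstream Messages API response; the exact fields depend on how the server forwards the provider's reply.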
### Additional Endpoints
- `GET /v1/token-count?text=your_text&model=model_name`: Count tokens in a text string (example below)
- `GET /health`: Server health check
- `GET /metrics`: View server performance metrics
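For example, the token-count endpoint can be called with the query parameters listed above; the response shape is not documented here, so it is only logged:

```javascript
// Example call to the token-count endpoint (response fields are not documented in this README).
const res = await fetch(
  'http://localhost:3000/v1/token-count?' +
    new URLSearchParams({
      text: 'Tell me about context management.',
      model: 'claude-3-opus-20240229',
    })
);
console.log(await res.json()); // token count for the supplied text
```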
## Testing
A test script is included to demonstrate how context optimization works:
```bash
# Run the test script
npm run test:context
```
This will start an interactive session where you can have a conversation and see how the context gets optimized as it grows.
## Advanced Features
### Context Summarization
When a conversation exceeds 80% of the model's token limit, the server automatically summarizes older messages. This summarization is cached for future use.
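That 80% rule might reduce to a check roughly like the following sketch; `countTokens` and `summarize` are hypothetical helpers supplied by the caller, not functions exported by this repository:

```javascript
// Illustrative thresholding logic; countTokens() and summarize() are hypothetical
// helpers injected by the caller, not part of this repository's API.
const SUMMARIZE_THRESHOLD = 0.8; // summarize once 80% of the model's limit is used

async function optimizeContext(messages, modelTokenLimit, { countTokens, summarize }) {
  const used = await countTokens(messages);
  if (used <= modelTokenLimit * SUMMARIZE_THRESHOLD) {
    return messages;                                  // enough headroom: pass through unchanged
  }

  // Summarize the older half of the conversation and keep the recent turns verbatim.
  const cutoff = Math.floor(messages.length / 2);
  const summary = await summarize(messages.slice(0, cutoff)); // result can be cached for reuse
  return [
    { role: 'user', content: `Summary of earlier conversation: ${summary}` },
    ...messages.slice(cutoff),
  ];
}
```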
### Conversation Continuity
By providing a consistent `conversation_id` in requests, the server can maintain context across multiple API calls, even if individual requests would exceed token limits.
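For example, two separate requests that reuse the same `conversation_id` let the server stitch their context together; the id value below is just a placeholder:

```javascript
// Two calls sharing a conversation_id so the server can carry context between them.
async function ask(content) {
  const res = await fetch('http://localhost:3000/v1/messages', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'claude-3-opus-20240229',
      messages: [{ role: 'user', content }],
      max_tokens: 1000,
      conversation_id: 'support-ticket-4711', // same id on every call
      context_optimization: true,
    }),
  });
  return res.json();
}

await ask('Summarize our refund policy.');
await ask('Now draft a reply to the customer using that summary.'); // earlier context is recalled server-side
```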
## Performance Considerations
- In-memory cache provides fastest access for active conversations
- Redis enables persistence and sharing across server instances
- Summarization operations add some latency to requests that exceed token thresholds
## Documentation
Additional documentation can be found in the `docs/` directory:
- [Architecture Overview](docs/ARCHITECTURE.md)
- [Contributing Guide](docs/CONTRIBUTING.md)
## License
MIT