
# Context Management for Pydantic AI

**Automatic Conversation Summarization and History Management**

PyPI · Python 3.10+ · MIT License · CI · Pydantic AI

**Intelligent Summarization** — LLM-powered context compression • **Sliding Window** — zero-cost message trimming • **Context Manager** — real-time token tracking + tool truncation • **Safe Cutoff** — preserves tool call pairs
---

**Context Management for Pydantic AI** helps your [Pydantic AI](https://ai.pydantic.dev/) agents handle long conversations without exceeding model context limits. Choose between intelligent LLM summarization or fast sliding window trimming.

> **Full framework?** Check out [Pydantic Deep Agents](https://github.com/vstorm-co/pydantic-deepagents) — complete agent framework with planning, filesystem, subagents, and skills.

## Use Cases

| What You Want to Build | How This Library Helps |
|------------------------|------------------------|
| **Long-Running Agent** | Automatically compress history when context fills up |
| **Customer Support Bot** | Preserve key details while discarding routine exchanges |
| **Code Assistant** | Keep recent code context, summarize older discussions |
| **High-Throughput App** | Zero-cost sliding window for maximum speed |
| **Cost-Sensitive App** | Choose between quality (summarization) or free (sliding window) |

## Installation

```bash
pip install summarization-pydantic-ai
```

Or with uv:

```bash
uv add summarization-pydantic-ai
```

For accurate token counting:

```bash
pip install "summarization-pydantic-ai[tiktoken]"
```

For real-time token tracking and tool output truncation:

```bash
pip install "summarization-pydantic-ai[hybrid]"
```

## Quick Start

```python
from pydantic_ai import Agent
from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    trigger=("tokens", 100000),
    keep=("messages", 20),
)

agent = Agent(
    "openai:gpt-4o",
    history_processors=[processor],
)

result = await agent.run("Hello!")
```

**That's it.** Your agent now:

- Monitors conversation size on every turn
- Summarizes older messages when limits are reached
- Preserves tool call/response pairs (never breaks them)
- Keeps recent context intact

## Available Processors

| Processor | LLM Cost | Latency | Context Preservation |
|-----------|----------|---------|---------------------|
| `SummarizationProcessor` | High | High | Intelligent summary |
| `SlidingWindowProcessor` | Zero | ~0ms | Discards old messages |
| `ContextManagerMiddleware` | Per compression | Low tracking / High compression | Intelligent summary |

### Intelligent Summarization

Uses an LLM to create summaries of older messages:

```python
from pydantic_ai_summarization import create_summarization_processor

processor = create_summarization_processor(
    trigger=("tokens", 100000),  # when to summarize
    keep=("messages", 20),       # what to keep
)
```

### Zero-Cost Sliding Window

Simply discards old messages — no LLM calls:

```python
from pydantic_ai_summarization import create_sliding_window_processor

processor = create_sliding_window_processor(
    trigger=("messages", 100),  # when to trim
    keep=("messages", 50),      # what to keep
)
```
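Conceptually, sliding-window trimming amounts to a simple slice. A minimal sketch of the idea (illustrative, not the library's implementation):

```python
# Illustrative sketch of sliding-window trimming - not the library's code.

def sliding_window(messages: list, trigger: int, keep: int) -> list:
    """If the history exceeds `trigger` messages, keep only the last `keep`."""
    if len(messages) > trigger:
        return messages[-keep:]
    return messages

history = [f"msg-{i}" for i in range(120)]
trimmed = sliding_window(history, trigger=100, keep=50)
print(len(trimmed))  # 50
```

Because there is no LLM call, the cost is zero and the latency is a single list slice.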

### Real-Time Context Manager

Dual-protocol middleware combining token tracking, auto-compression, and tool output truncation:

```python
from pydantic_ai import Agent
from pydantic_ai_summarization import create_context_manager_middleware

middleware = create_context_manager_middleware(
    max_tokens=200_000,
    compress_threshold=0.9,
    on_usage_update=lambda pct, cur, mx: print(f"{pct:.0%} used ({cur:,}/{mx:,})"),
)

agent = Agent(
"openai:gpt-4o",
history_processors=[middleware],
)
```

Requires the `hybrid` extra: `pip install "summarization-pydantic-ai[hybrid]"`.
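The `on_usage_update` callback receives plain numbers, so formatting is entirely up to you. A hypothetical standalone callback (`report_usage` is our name, not the library's) with the same three-argument shape as the lambda above:

```python
def report_usage(pct: float, current: int, maximum: int) -> str:
    """Format token usage the same way as the lambda in the example above."""
    return f"{pct:.0%} used ({current:,}/{maximum:,})"

print(report_usage(0.45, 90_000, 200_000))  # 45% used (90,000/200,000)
```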

## Trigger Types

| Type | Example | Description |
|------|---------|-------------|
| `messages` | `("messages", 50)` | Trigger when message count exceeds threshold |
| `tokens` | `("tokens", 100000)` | Trigger when token count exceeds threshold |
| `fraction` | `("fraction", 0.8)` | Trigger at percentage of max_input_tokens |
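The three trigger types reduce to simple comparisons, with multiple triggers combining as OR. An illustrative evaluation sketch (not the library's code; the function name and signature are ours):

```python
# Illustrative sketch of trigger evaluation - not the library's internals.
# A trigger is a (type, threshold) tuple; a list of triggers fires if ANY matches.

def should_compress(triggers, n_messages, n_tokens, max_input_tokens=None):
    for kind, threshold in triggers:
        if kind == "messages" and n_messages > threshold:
            return True
        if kind == "tokens" and n_tokens > threshold:
            return True
        if kind == "fraction" and max_input_tokens and n_tokens > threshold * max_input_tokens:
            return True
    return False

print(should_compress([("messages", 50)], n_messages=60, n_tokens=0))               # True
print(should_compress([("tokens", 100000)], n_messages=10, n_tokens=90_000))        # False
print(should_compress([("fraction", 0.8)], 10, 110_000, max_input_tokens=128_000))  # True
```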

## Keep Types

| Type | Example | Description |
|------|---------|-------------|
| `messages` | `("messages", 20)` | Keep last N messages |
| `tokens` | `("tokens", 10000)` | Keep last N tokens worth |
| `fraction` | `("fraction", 0.2)` | Keep last N% of context |
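Selecting what to keep can likewise be sketched in a few lines. This is an illustrative model only (the fraction case is omitted; it maps to a token budget given `max_input_tokens`):

```python
# Illustrative sketch of "keep" selection - not the library's internals.

def select_kept(messages, token_counts, keep):
    kind, amount = keep
    if kind == "messages":
        return messages[-amount:]  # keep the last N messages
    if kind == "tokens":
        kept, budget = [], amount
        # Walk backwards, keeping messages while they fit the token budget.
        for msg, cost in zip(reversed(messages), reversed(token_counts)):
            if cost > budget:
                break
            kept.append(msg)
            budget -= cost
        return list(reversed(kept))
    raise ValueError(f"unknown keep type: {kind}")

msgs = ["a", "b", "c", "d"]
print(select_kept(msgs, [100, 100, 100, 100], ("messages", 2)))  # ['c', 'd']
print(select_kept(msgs, [100, 100, 100, 100], ("tokens", 250)))  # ['c', 'd']
```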

## Advanced Configuration

### Multiple Triggers

```python
from pydantic_ai_summarization import SummarizationProcessor

processor = SummarizationProcessor(
    model="openai:gpt-4o",
    trigger=[
        ("messages", 50),    # OR: 50+ messages
        ("tokens", 100000),  # OR: 100k+ tokens
    ],
    keep=("messages", 10),
)
```

### Fraction-Based

```python
processor = SummarizationProcessor(
    model="openai:gpt-4o",
    trigger=("fraction", 0.8),  # summarize at 80% of the context window
    keep=("fraction", 0.2),     # keep the last 20%
    max_input_tokens=128000,    # GPT-4o's context window
)
```
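With these settings, the effective token thresholds are easy to work out:

```python
# Effective thresholds implied by the fraction-based configuration above.
max_input_tokens = 128_000
trigger_tokens = int(0.8 * max_input_tokens)  # summarize above 102,400 tokens
keep_tokens = int(0.2 * max_input_tokens)     # keep roughly the last 25,600 tokens
print(trigger_tokens, keep_tokens)  # 102400 25600
```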

### Custom Token Counter

```python
def my_token_counter(messages):
    # Rough heuristic: ~4 characters per token
    return sum(len(str(msg)) for msg in messages) // 4

processor = create_summarization_processor(
    token_counter=my_token_counter,
)
```

### Custom Model (e.g., Azure OpenAI)

```python
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai_summarization import create_summarization_processor

azure_model = OpenAIModel(
    "gpt-4o",
    provider=OpenAIProvider(
        base_url="https://my-resource.openai.azure.com/openai/deployments/gpt-4o",
        api_key="your-azure-api-key",
    ),
)

processor = create_summarization_processor(
    model=azure_model,
    trigger=("tokens", 100000),
    keep=("messages", 20),
)
```

### Custom Summary Prompt

```python
processor = create_summarization_processor(
    summary_prompt="""
Extract key information from this conversation.
Focus on: decisions made, code written, pending tasks.

Conversation:
{messages}
""",
)
```

## Why Choose This Library?

| Feature | Description |
|---------|-------------|
| **Two Strategies** | Intelligent summarization or fast sliding window |
| **Flexible Triggers** | Message count, token count, or fraction-based |
| **Safe Cutoff** | Never breaks tool call/response pairs |
| **Custom Counters** | Bring your own token counting logic |
| **Custom Prompts** | Control how summaries are generated |
| **Token Tracking** | Real-time usage monitoring with callbacks |
| **Tool Truncation** | Automatic truncation of large tool outputs |
| **Custom Models** | Use any pydantic-ai Model (Azure, custom providers) |
| **Lightweight** | Only requires pydantic-ai-slim (no extra model SDKs) |

## Related Projects

| Package | Description |
|---------|-------------|
| [Pydantic Deep Agents](https://github.com/vstorm-co/pydantic-deepagents) | Full agent framework (uses this library) |
| [pydantic-ai-backend](https://github.com/vstorm-co/pydantic-ai-backend) | File storage and Docker sandbox |
| [pydantic-ai-todo](https://github.com/vstorm-co/pydantic-ai-todo) | Task planning toolset |
| [subagents-pydantic-ai](https://github.com/vstorm-co/subagents-pydantic-ai) | Multi-agent orchestration |
| [pydantic-ai](https://github.com/pydantic/pydantic-ai) | The foundation — agent framework by Pydantic |

## Contributing

```bash
git clone https://github.com/vstorm-co/summarization-pydantic-ai.git
cd summarization-pydantic-ai
make install
make test  # 100% coverage required
```

## License

MIT — see [LICENSE](LICENSE)


Built with ❤️ by vstorm-co