{"id":41230219,"url":"https://github.com/vstorm-co/summarization-pydantic-ai","last_synced_at":"2026-04-01T20:50:27.889Z","repository":{"id":333760491,"uuid":"1138580729","full_name":"vstorm-co/summarization-pydantic-ai","owner":"vstorm-co","description":"Context Management processor for Pydantic AI agents, providing LLM-powered summarization or zero-cost sliding window trimming to handle infinite/long-running conversations without context overflow. Supports flexible triggers, safe cutoffs, and custom prompts for efficient AI apps.","archived":false,"fork":false,"pushed_at":"2026-03-31T15:02:31.000Z","size":1148,"stargazers_count":16,"open_issues_count":1,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-03-31T17:08:24.498Z","etag":null,"topics":["ai-agents","anthropic","chatgpt","claude","context-engineering","deepagents","gemini","llm","pydantic-ai","python"],"latest_commit_sha":null,"homepage":"https://vstorm-co.github.io/summarization-pydantic-ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vstorm-co.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-20T21:21:46.000Z","updated_at":"2026-03-31T15:02:14.000Z","dependencies_parsed_at":"2026-03-31T17:03:25.732Z","dependency_job_id":null,"html_url":"https://github.com/vstorm-co/summarization-pydantic-ai","commit_stats":null,"previous_names":["vstorm-co/summarization-pydantic-ai"],"tags_count":9,"te
mplate":false,"template_full_name":null,"purl":"pkg:github/vstorm-co/summarization-pydantic-ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vstorm-co%2Fsummarization-pydantic-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vstorm-co%2Fsummarization-pydantic-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vstorm-co%2Fsummarization-pydantic-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vstorm-co%2Fsummarization-pydantic-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vstorm-co","download_url":"https://codeload.github.com/vstorm-co/summarization-pydantic-ai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vstorm-co%2Fsummarization-pydantic-ai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31291837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T13:12:26.723Z","status":"ssl_error","status_checked_at":"2026-04-01T13:12:25.102Z","response_time":53,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","anthropic","chatgpt","claude","context-engineering","deepagents","gemini","llm","pydantic-ai","python"],"created_at":"2026-01-23T00:49:42.435Z","updated_at":"2026-04-01T20:50:27.881Z","avatar_url":"https://github.com/vstorm-co.png","language":"Python","funding_links":[],"categories":["Frameworks \u0026 Libraries"],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003eContext Management for Pydantic AI\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eAutomatic Conversation Summarization and History Management\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://pypi.org/project/summarization-pydantic-ai/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/summarization-pydantic-ai.svg\" alt=\"PyPI version\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://www.python.org/downloads/\"\u003e\u003cimg src=\"https://img.shields.io/badge/python-3.10+-blue.svg\" alt=\"Python 3.10+\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://opensource.org/licenses/MIT\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-MIT-yellow.svg\" alt=\"License: MIT\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/vstorm-co/summarization-pydantic-ai/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/vstorm-co/summarization-pydantic-ai/actions/workflows/ci.yml/badge.svg\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/pydantic/pydantic-ai\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/Powered%20by-Pydantic%20AI-E92063?logo=pydantic\u0026logoColor=white\" alt=\"Pydantic AI\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cb\u003eIntelligent Summarization\u003c/b\u003e — LLM-powered context compression\n  \u0026nbsp;\u0026bull;\u0026nbsp;\n  \u003cb\u003eSliding Window\u003c/b\u003e — zero-cost message trimming\n  \u0026nbsp;\u0026bull;\u0026nbsp;\n  \u003cb\u003eLimit Warnings\u003c/b\u003e — finish-soon guidance before hard caps\n  \u0026nbsp;\u0026bull;\u0026nbsp;\n  \u003cb\u003eContext Manager\u003c/b\u003e — real-time token tracking + tool truncation\n  \u0026nbsp;\u0026bull;\u0026nbsp;\n  \u003cb\u003eSafe Cutoff\u003c/b\u003e — preserves tool call pairs\n\u003c/p\u003e\n\n---\n\n**Context Management for Pydantic AI** helps your [Pydantic AI](https://ai.pydantic.dev/) agents handle long conversations without exceeding model context limits. Choose between intelligent LLM summarization or fast sliding window trimming.\n\n\u003e **Full framework?** Check out [Pydantic Deep Agents](https://github.com/vstorm-co/pydantic-deepagents) — complete agent framework with planning, filesystem, subagents, and skills.\n\n## Use Cases\n\n| What You Want to Build | How This Library Helps |\n|------------------------|------------------------|\n| **Long-Running Agent** | Automatically compress history when context fills up |\n| **Customer Support Bot** | Preserve key details while discarding routine exchanges |\n| **Code Assistant** | Keep recent code context, summarize older discussions |\n| **High-Throughput App** | Zero-cost sliding window for maximum speed |\n| **Cost-Sensitive App** | Choose between quality (summarization) or free (sliding window) |\n\n## Installation\n\n```bash\npip install summarization-pydantic-ai\n```\n\nOr with uv:\n\n```bash\nuv add summarization-pydantic-ai\n```\n\nFor accurate token counting:\n\n```bash\npip install summarization-pydantic-ai[tiktoken]\n```\n\n## 
Quick Start — Capabilities (Recommended)\n\nThe recommended way to add context management is via pydantic-ai's native [Capabilities API](https://ai.pydantic.dev/capabilities/):\n\n```python\nfrom pydantic_ai import Agent\nfrom pydantic_ai_summarization import ContextManagerCapability\n\nagent = Agent(\n    \"openai:gpt-4.1\",\n    capabilities=[ContextManagerCapability(max_tokens=100_000)],\n)\n\nresult = await agent.run(\"Hello!\")\n```\n\n**That's it.** Your agent now:\n\n- Tracks token usage on every turn\n- Auto-compresses when approaching the limit (90% by default)\n- Truncates large tool outputs\n- Auto-detects context window size from the model\n- Preserves tool call/response pairs (never breaks them)\n\n### Combine with Limit Warnings\n\n```python\nfrom pydantic_ai_summarization import ContextManagerCapability, LimitWarnerCapability\n\nagent = Agent(\n    \"openai:gpt-4.1\",\n    capabilities=[\n        LimitWarnerCapability(max_iterations=40, max_context_tokens=100_000),\n        ContextManagerCapability(max_tokens=100_000),\n    ],\n)\n```\n\n### Alternative: Processor API\n\nFor standalone use without capabilities:\n\n```python\nfrom pydantic_ai import Agent\nfrom pydantic_ai_summarization import create_summarization_processor\n\nprocessor = create_summarization_processor(\n    trigger=(\"tokens\", 100000),\n    keep=(\"messages\", 20),\n)\n\nagent = Agent(\"openai:gpt-4.1\", history_processors=[processor])\n```\n\n## Available Processors\n\n| Processor | LLM Cost | Latency | Context Preservation |\n|-----------|----------|---------|---------------------|\n| `ContextManagerCapability` | Per compression | Low tracking | Intelligent summary + tool truncation |\n| `SummarizationProcessor` | High | High | Intelligent summary |\n| `SlidingWindowProcessor` | Zero | ~0ms | Discards old messages |\n| `LimitWarnerProcessor` | Zero | ~0ms | Full history + warning injection |\n\n### Intelligent Summarization\n\nUses an LLM to create summaries of older 
messages:\n\n```python\nfrom pydantic_ai_summarization import create_summarization_processor\n\nprocessor = create_summarization_processor(\n    trigger=(\"tokens\", 100000),  # When to summarize\n    keep=(\"messages\", 20),       # What to keep\n)\n```\n\n### Zero-Cost Sliding Window\n\nSimply discards old messages — no LLM calls:\n\n```python\nfrom pydantic_ai_summarization import create_sliding_window_processor\n\nprocessor = create_sliding_window_processor(\n    trigger=(\"messages\", 100),  # When to trim\n    keep=(\"messages\", 50),      # What to keep\n)\n```\n\n### Limit Warnings\n\nWarn the agent before requests, context usage, or total tokens hit a cap:\n\n```python\nfrom pydantic_ai_summarization import create_limit_warner_processor\n\nprocessor = create_limit_warner_processor(\n    max_iterations=40,\n    max_context_tokens=100000,\n    max_total_tokens=200000,\n)\n```\n\n### Context Manager Capability\n\nFull context management with token tracking, auto-compression, and tool output truncation:\n\n```python\nfrom pydantic_ai import Agent\nfrom pydantic_ai_summarization import ContextManagerCapability\n\nagent = Agent(\n    \"openai:gpt-4.1\",\n    capabilities=[ContextManagerCapability(\n        max_tokens=100_000,\n        compress_threshold=0.9,\n        max_tool_output_tokens=5000,\n    )],\n)\n```\n\n## Trigger Types\n\n| Type | Example | Description |\n|------|---------|-------------|\n| `messages` | `(\"messages\", 50)` | Trigger when message count exceeds threshold |\n| `tokens` | `(\"tokens\", 100000)` | Trigger when token count exceeds threshold |\n| `fraction` | `(\"fraction\", 0.8)` | Trigger at percentage of max_input_tokens |\n\n## Keep Types\n\n| Type | Example | Description |\n|------|---------|-------------|\n| `messages` | `(\"messages\", 20)` | Keep last N messages |\n| `tokens` | `(\"tokens\", 10000)` | Keep last N tokens worth |\n| `fraction` | `(\"fraction\", 0.2)` | Keep last N% of context |\n\n## Advanced Configuration\n\n### 
Multiple Triggers\n\n```python\nfrom pydantic_ai_summarization import SummarizationProcessor\n\nprocessor = SummarizationProcessor(\n    model=\"openai:gpt-4o\",\n    trigger=[\n        (\"messages\", 50),    # Trigger at 50+ messages\n        (\"tokens\", 100000),  # OR at 100k+ tokens\n    ],\n    keep=(\"messages\", 10),\n)\n```\n\n### Fraction-Based\n\n```python\nfrom pydantic_ai_summarization import SummarizationProcessor\n\nprocessor = SummarizationProcessor(\n    model=\"openai:gpt-4o\",\n    trigger=(\"fraction\", 0.8),  # 80% of context window\n    keep=(\"fraction\", 0.2),     # Keep last 20%\n    max_input_tokens=128000,    # GPT-4o's context window\n)\n```\n\n### Custom Token Counter\n\n```python\nfrom pydantic_ai_summarization import create_summarization_processor\n\ndef my_token_counter(messages):\n    # Rough heuristic: about 4 characters per token\n    return sum(len(str(msg)) for msg in messages) // 4\n\nprocessor = create_summarization_processor(\n    token_counter=my_token_counter,\n)\n```\n\n### Custom Model (e.g., Azure OpenAI)\n\n```python\nfrom pydantic_ai.models.openai import OpenAIModel\nfrom pydantic_ai.providers.openai import OpenAIProvider\nfrom pydantic_ai_summarization import create_summarization_processor\n\nazure_model = OpenAIModel(\n    \"gpt-4o\",\n    provider=OpenAIProvider(\n        base_url=\"https://my-resource.openai.azure.com/openai/deployments/gpt-4o\",\n        api_key=\"your-azure-api-key\",\n    ),\n)\n\nprocessor = create_summarization_processor(\n    model=azure_model,\n    trigger=(\"tokens\", 100000),\n    keep=(\"messages\", 20),\n)\n```\n\n### Custom Summary Prompt\n\n```python\nfrom pydantic_ai_summarization import create_summarization_processor\n\nprocessor = create_summarization_processor(\n    summary_prompt=\"\"\"\n    Extract key information from this conversation.\n    Focus on: decisions made, code written, pending tasks.\n\n    Conversation:\n    {messages}\n    \"\"\",\n)\n```\n\n## Why Choose This Library?\n\n| Feature | Description |\n|---------|-------------|\n| **Two Strategies** | Intelligent summarization or fast sliding window |\n| **Flexible Triggers** | Message count, token count, or fraction-based |\n| **Safe Cutoff** | Never breaks tool call/response pairs |\n| 
**Auto max_tokens** | Auto-detect context window from genai-prices |\n| **Message Persistence** | Save all messages to JSON for session resume |\n| **Guided Compaction** | Focus summaries on specific topics |\n| **Callbacks** | on_before/after_compress with instruction re-injection |\n| **Async Token Counting** | Sync or async token counter support |\n| **Token Tracking** | Real-time usage monitoring with callbacks |\n| **Tool Truncation** | Automatic truncation of large tool outputs |\n| **Custom Models** | Use any pydantic-ai Model (Azure, custom providers) |\n| **Lightweight** | Only requires pydantic-ai-slim (no extra model SDKs) |\n\n## Related Projects\n\n| Package | Description |\n|---------|-------------|\n| [Pydantic Deep Agents](https://github.com/vstorm-co/pydantic-deepagents) | Full agent framework (uses this library) |\n| [pydantic-ai-backend](https://github.com/vstorm-co/pydantic-ai-backend) | File storage and Docker sandbox |\n| [pydantic-ai-todo](https://github.com/vstorm-co/pydantic-ai-todo) | Task planning toolset |\n| [subagents-pydantic-ai](https://github.com/vstorm-co/subagents-pydantic-ai) | Multi-agent orchestration |\n| [pydantic-ai](https://github.com/pydantic/pydantic-ai) | The foundation — agent framework by Pydantic |\n\n## Contributing\n\n```bash\ngit clone https://github.com/vstorm-co/summarization-pydantic-ai.git\ncd summarization-pydantic-ai\nmake install\nmake test  # 100% coverage required\n```\n\n## License\n\nMIT — see [LICENSE](LICENSE)\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n### Need help implementing this in your company?\n\n\u003cp\u003eWe're \u003ca href=\"https://vstorm.co\"\u003e\u003cb\u003eVstorm\u003c/b\u003e\u003c/a\u003e — an Applied Agentic AI Engineering Consultancy\u003cbr\u003ewith 30+ production AI agent implementations.\u003c/p\u003e\n\n\u003ca href=\"https://vstorm.co/contact-us/\"\u003e\n  \u003cimg 
src=\"https://img.shields.io/badge/Talk%20to%20us%20%E2%86%92-0066FF?style=for-the-badge\u0026logoColor=white\" alt=\"Talk to us\"\u003e\n\u003c/a\u003e\n\n\u003cbr\u003e\u003cbr\u003e\n\nMade with ❤️ by \u003ca href=\"https://vstorm.co\"\u003e\u003cb\u003eVstorm\u003c/b\u003e\u003c/a\u003e\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvstorm-co%2Fsummarization-pydantic-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvstorm-co%2Fsummarization-pydantic-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvstorm-co%2Fsummarization-pydantic-ai/lists"}