{"id":48670294,"url":"https://github.com/shipiit/shipit_agent","last_synced_at":"2026-04-15T17:01:10.334Z","repository":{"id":350240895,"uuid":"1205870217","full_name":"shipiit/shipit_agent","owner":"shipiit","description":"Powerful Python agent runtime with tools, MCP, Hooks, Skills, Rag, memory, sessions, reasoning, and streaming packets.","archived":false,"fork":false,"pushed_at":"2026-04-13T08:10:48.000Z","size":11908,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-14T16:03:30.730Z","etag":null,"topics":["agent","agent-framework","agentic-ai","ai","automation","bedrock","hooks","llm","mcp","mcp-server","multi-agent","python","rag","skills","streaming","tool-calling"],"latest_commit_sha":null,"homepage":"https://docs.shipiit.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shipiit.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-09T11:07:36.000Z","updated_at":"2026-04-12T19:26:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"7af4a119-5762-4f04-b7ea-59462a2c1a45","html_url":"https://github.com/shipiit/shipit_agent","commit_stats":null,"previous_names":["shipiit/shipit_agent"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/shipiit/shipit_agent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipiit%2Fshipit_agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipiit%2Fshipit_agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipiit%2Fshipit_agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipiit%2Fshipit_agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shipiit","download_url":"https://codeload.github.com/shipiit/shipit_agent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipiit%2Fshipit_agent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31851057,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"ssl_error","status_checked_at":"2026-04-15T15:24:39.138Z","response_time":63,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","agent-framework","agentic-ai","ai","automation","bedrock","hooks","llm","mcp","mcp-server","multi-agent","python","rag","skills","streaming","tool-calling"],"created_at":"2026-04-10T12:05:47.049Z","updated_at":"2026-04-15T17:01:10.316Z","avatar_url":"https://github.com/shipiit.png","language":"Python","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"shipit-icon.svg\" alt=\"SHIPIT\" width=\"120\" height=\"120\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eSHIPIT Agent\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eA clean, powerful open-source Python agent library for building tool-using agents with MCP, browser workflows, local code execution, runtime policies, and structured streaming events.\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eBuild agents with local tools, remote MCP servers, memory, sessions, artifact generation, and multiple LLM providers through one consistent runtime.\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://docs.shipiit.com/\"\u003e\u003cstrong\u003e📖 Documentation\u003c/strong\u003e\u003c/a\u003e ·\n  \u003ca href=\"https://pypi.org/project/shipit-agent/\"\u003e\u003cstrong\u003e📦 PyPI\u003c/strong\u003e\u003c/a\u003e ·\n  \u003ca href=\"https://docs.shipiit.com/getting-started/quickstart/\"\u003eQuick start\u003c/a\u003e ·\n  \u003ca href=\"https://docs.shipiit.com/guides/streaming/\"\u003eStreaming\u003c/a\u003e ·\n  \u003ca href=\"https://docs.shipiit.com/guides/reasoning/\"\u003eReasoning\u003c/a\u003e ·\n  \u003ca href=\"https://docs.shipiit.com/guides/tool-search/\"\u003eTool search\u003c/a\u003e ·\n  \u003ca href=\"SECURITY.md\"\u003eSecurity\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eReadable docs, explicit tools, and a runtime that is small enough to extend without fighting framework overhead.\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://pypi.org/project/shipit-agent/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/shipit-agent?style=for-the-badge\u0026color=blue\u0026label=pypi\" alt=\"PyPI\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/shipit-agent/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/pyversions/shipit-agent?style=for-the-badge\u0026color=green\" alt=\"Python versions\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/shipit-agent/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/dm/shipit-agent?style=for-the-badge\u0026color=purple\u0026label=downloads\" alt=\"Downloads\" /\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE.md\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-yellow?style=for-the-badge\" alt=\"License\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://docs.shipiit.com/\"\u003e\u003cimg src=\"https://img.shields.io/badge/docs-mkdocs--material-483D8B?style=for-the-badge\" alt=\"Docs\" /\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  
## 🚀 What's new in 1.0.3

**SHIPIT Agent 1.0.3** ships **Super RAG**, the **DeepAgent factory**, a **live multi-agent chat REPL**, and an **Agent memory cookbook**. **521 unit tests. 19 Bedrock end-to-end smoke tests. All passing.**

### Super RAG — hybrid search with auto-cited sources

```python
from shipit_agent import Agent
from shipit_agent.rag import RAG, HashingEmbedder

rag = RAG.default(embedder=HashingEmbedder(dimension=512))
rag.index_file("docs/manual.pdf")

agent = Agent.with_builtins(llm=llm, rag=rag)
result = agent.run("How do I configure logging?")

print(result.output)              # "Set SHIPIT_LOG_LEVEL=debug. [1]"
for src in result.rag_sources:     # DRK_CACHE-style citation panel
    print(f"[{src.index}] {src.source}: {src.text[:80]}")
```

Pluggable `VectorStore` / `KeywordStore` / `Embedder` / `Reranker` protocols, hybrid vector+BM25 search with Reciprocal Rank Fusion, context expansion, optional recency bias, and a thread-local per-run source tracker.
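Reciprocal Rank Fusion is worth seeing concretely. Below is a minimal, self-contained sketch of the fusion step only — illustrative stdlib Python, not SHIPIT's internal implementation; the `k=60` constant and the toy document IDs are assumptions:

```python
# Illustrative Reciprocal Rank Fusion — not the library's internal code.
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists: each doc scores 1 / (k + rank) per list."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)


vector_hits = ["doc_logging", "doc_install", "doc_faq"]      # dense retrieval
bm25_hits = ["doc_logging", "doc_changelog", "doc_install"]  # keyword retrieval
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# ['doc_logging', 'doc_install', 'doc_changelog', 'doc_faq'] — agreement wins
```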
### DeepAgent — one factory, all the power

```python
from shipit_agent.deep import DeepAgent, Goal

agent = DeepAgent.with_builtins(
    llm=llm,
    rag=rag,                                 # grounded answers
    verify=True,                              # verifier after every answer
    reflect=True,                             # self-critique loop
    goal=Goal(                                # goal-driven decomposition
        objective="Ship the auth fix",
        success_criteria=["Patch compiles", "Tests pass"],
    ),
    agents=[researcher, writer, reviewer],    # named sub-agent delegates
)
result = agent.run()
```

Seven deep tools wired automatically (`plan_task`, `decompose_problem`, `workspace_files`, `sub_agent`, `synthesize_evidence`, `decision_matrix`, `verify_output`). `create_deep_agent()` gives you the functional spelling and auto-wraps plain Python functions as tools.

### Live chat — `shipit chat`

```bash
shipit chat                                    # default: DeepAgent
shipit chat --agent goal --goal "Build a CLI"
shipit chat --rag-file docs/manual.pdf --reflect --verify
```

Modern multi-agent terminal REPL. Switch agent types live with `/agent`, index files mid-session with `/index`, save/load conversations, toggle `reflect`/`verify`, inspect sources. Works with every LLM provider.

### Agent memory — OpenAI-style "remember things"

```python
from shipit_agent import Agent, AgentMemory
from shipit_agent.stores import FileMemoryStore, FileSessionStore

profile = AgentMemory.default(llm=llm, embedding_fn=embed)
profile.add_fact("user_timezone=Europe/Berlin")

agent = Agent.with_builtins(
    llm=llm,
    memory_store=FileMemoryStore(root="~/.shipit/memory"),      # LLM-writable
    session_store=FileSessionStore(root="~/.shipit/sessions"),  # chat history
    history=profile.get_conversation_messages(),                # curated profile
)
```

Two complementary memory systems: `memory_store=` for the LLM's `memory` tool, `AgentMemory` for application-curated profiles. Full cookbook in `docs/agent/memory.md`.

---

## 🎯 Also in 1.0.3

**SHIPIT Agent 1.0.2** (still available) introduced deep agents, structured output, pipelines, agent teams, advanced memory, and output parsers. 1.0.3 builds directly on that foundation.

### Deep Agents — Beyond LangChain

```python
from shipit_agent.deep import GoalAgent, Goal, ReflectiveAgent, Supervisor, Worker

# GoalAgent — autonomous goal decomposition with streaming
agent = GoalAgent.with_builtins(llm=llm, goal=Goal(
    objective="Build a comparison of Python web frameworks",
    success_criteria=["Covers Django, Flask, FastAPI", "Includes benchmarks"],
))
for event in agent.stream():
    print(f"[{event.type}] {event.message}")
    if event.payload.get("output"):
        print(event.payload["output"][:200])

# ReflectiveAgent — self-improving with quality scores
agent = ReflectiveAgent.with_builtins(llm=llm, quality_threshold=0.8)
result = agent.run("Explain the CAP theorem")
print(f"Quality: {result.final_quality}, Revisions: {len(result.revisions)}")

# Supervisor — hierarchical multi-agent management
supervisor = Supervisor.with_builtins(llm=llm, worker_configs=[
    {"name": "analyst", "prompt": "You analyze data."},
    {"name": "writer", "prompt": "You write reports."},
])
for event in supervisor.stream("Analyze AI trends and write a summary"):
    print(f"[{event.payload.get('worker', 'supervisor')}] {event.message}")
```

### Structured Output — One Parameter

```python
from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    confidence: float
    topics: list[str]

result = agent.run("Analyze this review", output_schema=Analysis)
result.parsed.sentiment   # "positive"
result.parsed.confidence  # 0.95
```

### Pipeline Composition

```python
from shipit_agent import Pipeline, step, parallel

pipe = Pipeline(
    parallel(
        step("research", agent=researcher, prompt="Research {topic}"),
        step("trends", agent=analyst, prompt="Trends in {topic}"),
    ),
    step("write", agent=writer, prompt="Article using:\n{research.output}\n{trends.output}"),
)
for event in pipe.stream(topic="AI agents"):
    print(f"[{event.payload.get('step', '')}] {event.message}")
```

### Agent Teams + Channels + Memory + Benchmark

```python
# Agent team with LLM-routed coordination
team = AgentTeam(coordinator=llm, agents=[researcher, writer, reviewer])
for event in team.stream("Write a guide about async Python"):
    print(f"[{event.payload.get('agent')}] {event.message}")

# Typed agent communication
channel = Channel(name="pipeline")
channel.send(AgentMessage(from_agent="a", to_agent="b", type="data", data={...}))

# Advanced memory (conversation + semantic + entity)
memory = AgentMemory.default(llm=llm, embedding_fn=my_embed)

# Systematic agent testing
report = AgentBenchmark(name="eval", cases=[
    TestCase(input="What is Docker?", expected_contains=["container"]),
]).run(agent)
print(report.summary())
```
print(f\"[{event.payload.get('step', '')}] {event.message}\")\n```\n\n### Agent Teams + Channels + Memory + Benchmark\n\n```python\n# Agent team with LLM-routed coordination\nteam = AgentTeam(coordinator=llm, agents=[researcher, writer, reviewer])\nfor event in team.stream(\"Write a guide about async Python\"):\n    print(f\"[{event.payload.get('agent')}] {event.message}\")\n\n# Typed agent communication\nchannel = Channel(name=\"pipeline\")\nchannel.send(AgentMessage(from_agent=\"a\", to_agent=\"b\", type=\"data\", data={...}))\n\n# Advanced memory (conversation + semantic + entity)\nmemory = AgentMemory.default(llm=llm, embedding_fn=my_embed)\n\n# Systematic agent testing\nreport = AgentBenchmark(name=\"eval\", cases=[\n    TestCase(input=\"What is Docker?\", expected_contains=[\"container\"]),\n]).run(agent)\nprint(report.summary())\n```\n\n### Also in 1.0.2\n\n- **Parallel tool execution** — `parallel_tool_execution=True`\n- **Graceful tool failure** — errors become messages, not crashes\n- **Context window management** — token tracking + auto-compaction\n- **Hooks \u0026 middleware** — `@hooks.on_before_llm`, `@hooks.on_after_tool`\n- **Async runtime** — `AsyncAgentRuntime` for FastAPI\n- **Mid-run re-planning** — `replan_interval=N`\n- **Transient error auto-retry** — 429/500/503 retried automatically\n- **Output parsers** — JSON, Pydantic, Regex, Markdown\n\n---\n\n## 🚀 What's new in 1.0\n\n**SHIPIT Agent 1.0** is the first stable release. It ships a production-ready agent runtime built around three ideas: **every step is observable**, **every provider is interchangeable**, and **the runtime stays out of your way**. The headline features:\n\n- **🧠 Live reasoning / \"thinking\" events.** When the underlying model surfaces a reasoning block — OpenAI o-series (`o1`, `o3`, `o4`), `gpt-5`, DeepSeek R1, Anthropic Claude extended thinking, or AWS Bedrock `openai.gpt-oss-120b` — the runtime extracts it and emits `reasoning_started` / `reasoning_completed` events **before** the corresponding `tool_called` events. Your UI can render a live \"Thinking\" panel that matches what the model is actually doing under the hood, with no manual wiring. All three LLM adapters (direct OpenAI, direct Anthropic, LiteLLM/Bedrock) now share a common `reasoning_content` extraction helper that handles flat `reasoning_content` attributes, Anthropic-style `thinking_blocks`, and pydantic `model_dump()` fallbacks.\n- **⚡ Truly incremental streaming.** `agent.stream()` now runs the agent on a background worker thread and yields `AgentEvent` objects through a thread-safe queue as they are emitted by the runtime. No more \"everything arrives at once at the end\" — each `run_started`, `reasoning_completed`, `tool_called`, `tool_completed` event reaches your loop the instant it happens. Works in Jupyter, VS Code, JupyterLab, WebSocket/SSE packet transports, and plain terminals. Errors in the background worker are captured and re-raised on the consumer thread so nothing gets silently swallowed.\n- **🛡️ Bulletproof Bedrock tool pairing.** AWS Bedrock's Converse API enforces strict 1:1 pairing between `toolUse` blocks in an assistant turn and `toolResult` blocks in the next user turn. 
---

## 🚀 What's new in 1.0

**SHIPIT Agent 1.0** is the first stable release. It ships a production-ready agent runtime built around three ideas: **every step is observable**, **every provider is interchangeable**, and **the runtime stays out of your way**. The headline features:

- **🧠 Live reasoning / "thinking" events.** When the underlying model surfaces a reasoning block — OpenAI o-series (`o1`, `o3`, `o4`), `gpt-5`, DeepSeek R1, Anthropic Claude extended thinking, or AWS Bedrock `openai.gpt-oss-120b` — the runtime extracts it and emits `reasoning_started` / `reasoning_completed` events **before** the corresponding `tool_called` events. Your UI can render a live "Thinking" panel that matches what the model is actually doing under the hood, with no manual wiring. All three LLM adapters (direct OpenAI, direct Anthropic, LiteLLM/Bedrock) now share a common `reasoning_content` extraction helper that handles flat `reasoning_content` attributes, Anthropic-style `thinking_blocks`, and pydantic `model_dump()` fallbacks.
- **⚡ Truly incremental streaming.** `agent.stream()` now runs the agent on a background worker thread and yields `AgentEvent` objects through a thread-safe queue as they are emitted by the runtime. No more "everything arrives at once at the end" — each `run_started`, `reasoning_completed`, `tool_called`, `tool_completed` event reaches your loop the instant it happens. Works in Jupyter, VS Code, JupyterLab, WebSocket/SSE packet transports, and plain terminals. Errors in the background worker are captured and re-raised on the consumer thread so nothing gets silently swallowed.
- **🛡️ Bulletproof Bedrock tool pairing.** AWS Bedrock's Converse API enforces strict 1:1 pairing between `toolUse` blocks in an assistant turn and `toolResult` blocks in the next user turn. The 1.0 runtime guarantees this invariant everywhere: the planner output is injected as a `user`-role context message rather than an orphan `toolResult`; every `response.tool_calls` entry gets **either** a real tool-result **or** a synthetic error tool-result (for hallucinated tool names) so pairing never drifts; each call is stamped with a stable `call_{iteration}_{index}` ID that round-trips through the message metadata. Multi-iteration tool loops on Bedrock Claude, Bedrock gpt-oss, and Anthropic native all work reliably without `modify_params` band-aids.
- **🔑 Zero-friction provider switching via `.env`.** `build_llm_from_env()` now walks upward from CWD to discover a `.env` file, so the same notebook or script works whether CWD is the repo root, a `notebooks/` subdirectory, or a deeply nested workspace. Switching providers is a one-line `.env` edit (`SHIPIT_LLM_PROVIDER=openai|anthropic|bedrock|gemini|vertex|litellm|groq|together|ollama`) — no kernel restarts, no code edits, no custom boot scripts. Every provider works out of the box, with credential validation that raises a helpful error pointing to the exact env var you forgot to set.
- **🌐 In-process Playwright for `open_url`.** The built-in `open_url` tool now uses Playwright's Chromium directly (headless, realistic desktop UA, 1280×800 viewport, `en-US` locale) as its primary fetch path. Handles JS-rendered pages, anti-bot protections, and modern TLS/ALPN without depending on any external scraper service. Stdlib `urllib` is kept as a zero-dep fallback for static pages and environments without Playwright installed. No third-party HTTP libraries (no `httpx`, no `requests`, no `beautifulsoup4`) — just Playwright and the standard library. Errors never raise out of the tool: they come back as a normal `ToolOutput` with a `warnings` list in metadata, so the runtime's tool pairing stays balanced even when a target URL is down.
- **🪵 Full event table for observability.** 14 distinct event types are emitted over the lifetime of a run: `run_started`, `mcp_attached`, `planning_started`, `planning_completed`, `step_started`, `reasoning_started`, `reasoning_completed`, `tool_called`, `tool_completed`, `tool_retry`, `tool_failed`, `llm_retry`, `interactive_request`, `run_completed` — each with a documented payload and a stable shape. The [Streaming Events](#streaming-events) section below has a complete reference and a 17-step example trace of a real Bedrock run.
- **🔁 Iteration-cap summarization fallback.** If the model is still calling tools when the loop hits `max_iterations`, the runtime automatically gives it one more turn with `tools=[]` to force a natural-language summary, so consumers never see an empty final answer. The fallback is guarded with try/except so a summarization failure can't mask the rest of the run. (A schematic version is sketched after this list.)
- **🧩 Clean separation of concerns.** Runtime, tool registry, LLM adapters, MCP integration, memory, sessions, tracing, retry/router policies, and agent profiles are each one small module with a well-defined boundary. If you want to bring your own tool, your own LLM, your own MCP transport, or your own session store, you implement a single protocol and plug it in — no framework ceremony, no metaclasses, no hidden globals.
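The iteration-cap fallback bullet above reduces to a small loop shape. This is a schematic reimplementation for intuition only — `call_llm`, `run_tools`, and the response object are placeholders, not shipit_agent APIs:

```python
# Schematic shape of the iteration-cap summarization fallback. `call_llm` and
# `run_tools` are placeholders, not shipit_agent APIs.
def run_loop(call_llm, run_tools, tools, max_iterations: int = 8) -> str:
    messages: list = []
    for _ in range(max_iterations):
        response = call_llm(messages, tools=tools)
        if not response.tool_calls:        # model produced a final answer
            return response.content
        messages += run_tools(response.tool_calls)
    # Cap reached while the model still wants tools: one more turn with no
    # tools forces a natural-language summary; guarded so a summarization
    # failure can't mask the rest of the run.
    try:
        return call_llm(messages, tools=[]).content
    except Exception:
        return "Hit max_iterations; see the event trace for partial results."
```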
### Core feature summary

- bring your own LLM, or use any of the built-in provider adapters
- attach Python tools as classes, as `FunctionTool` wrappers, or as connector-style third-party tools (Gmail, Google Drive, Slack, …)
- attach local and remote MCP servers — HTTP, stdio subprocess, and persistent sessions all supported
- use prebuilt tools like `web_search`, `open_url` (Playwright-backed), `ask_user`, `human_review`, `code_interpreter`, `file_editor`, and more
- iterative multi-step tool loops with configurable `max_iterations` and an automatic summarization fallback
- built-in retry policies for transient LLM and tool errors, with dedicated `llm_retry` / `tool_retry` events
- memory store (in-memory or file-backed) for cross-turn facts, session store for conversation resumption, trace store for audit logs
- support for **OpenAI, Anthropic, AWS Bedrock, Google Gemini, Groq, Together AI, Ollama, Vertex AI, OpenRouter**, and any other LiteLLM-backed provider
- stream structured events through `agent.stream()`, `chat_session.stream_packets(transport="websocket")`, or `chat_session.stream_packets(transport="sse")`
- inspect every step: reasoning, tool arguments, tool outputs, retries, iteration counts, final answer
- compose reusable agent profiles with system prompts, tool selections, and policies locked in
- ship with a strong default system prompt and router/retry policies that work out of the box
- persistent file-backed session and memory stores for long-running, resumable agents
- persistent MCP subprocess sessions with graceful shutdown on run completion

## Install

Published package:

```bash
pip install shipit-agent
```

Local package install:

```bash
pip install .
```

Editable development install:

```bash
pip install -e .[dev]
```

If you prefer `requirements.txt`:

```bash
pip install -r requirements.txt
```

If you use Poetry instead of pip:

```bash
poetry install
poetry run pytest -q
```
Playwright is optional. The default web search path uses `duckduckgo` and does not require browser binaries.
If you want browser-rendered search or page automation, install the extra and browser bundle:

```bash
pip install -e .[playwright]
playwright install
```

Long-form documentation:

- 🌐 **[Full documentation site](https://docs.shipiit.com/)** — MkDocs Material, searchable, versioned
    - [Quick start](https://docs.shipiit.com/getting-started/quickstart/) · [Installation](https://docs.shipiit.com/getting-started/install/) · [Environment setup](https://docs.shipiit.com/getting-started/environment/)
    - [Streaming events](https://docs.shipiit.com/guides/streaming/) · [Reasoning & thinking](https://docs.shipiit.com/guides/reasoning/) · [Tool search](https://docs.shipiit.com/guides/tool-search/)
    - [Custom tools](https://docs.shipiit.com/guides/custom-tools/) · [MCP integration](https://docs.shipiit.com/guides/mcp/) · [Sessions & memory](https://docs.shipiit.com/guides/sessions/)
    - [Architecture](https://docs.shipiit.com/reference/architecture/) · [Event types reference](https://docs.shipiit.com/reference/events/) · [Model adapters](https://docs.shipiit.com/reference/adapters/)
- [Changelog](https://docs.shipiit.com/changelog/) — full v1.0 release notes
- [docs.md](docs.md) — legacy flat-markdown docs (kept for offline browsing)
- [TOOLS.md](TOOLS.md)
- [SECURITY.md](SECURITY.md)
- [LICENSE.md](LICENSE.md)

Environment and examples:

- [.env.example](.env.example)
- [examples/run_multi_tool_agent.py](examples/run_multi_tool_agent.py)
- [notebooks/shipit_agent_test_drive.ipynb](notebooks/shipit_agent_test_drive.ipynb)

## One Running Setup Example

This is the simplest high-power setup pattern for a real project. If you want a runnable script instead of an inline snippet, start from [examples/run_multi_tool_agent.py](examples/run_multi_tool_agent.py) and copy [.env.example](.env.example) to `.env`.

This setup gives you:

- provider selection from environment variables
- built-in tools plus a few local function tools
- persistent memory, sessions, and traces
- a clean place to add your own prompt, MCP servers, and connector credentials

```python
from shipit_agent import (
    Agent,
    CredentialRecord,
    FileCredentialStore,
    FileMemoryStore,
    FileSessionStore,
    FileTraceStore,
)
from shipit_agent.llms import BedrockChatLLM

credential_store = FileCredentialStore(".shipit_workspace/credentials.json")
credential_store.set(
    CredentialRecord(
        key="slack",
        provider="slack",
        secrets={"token": "SLACK_BOT_TOKEN"},
    )
)

agent = Agent.with_builtins(
    llm=BedrockChatLLM(model="bedrock/openai.gpt-oss-120b-1:0"),
    workspace_root=".shipit_workspace",
    memory_store=FileMemoryStore(".shipit_workspace/memory.json"),
    session_store=FileSessionStore(".shipit_workspace/sessions"),
    trace_store=FileTraceStore(".shipit_workspace/traces"),
    credential_store=credential_store,
    session_id="project-agent",
    trace_id="project-agent-run",
)

result = agent.run("Research the task, use tools, and keep the project context.")
print(result.output)
```
## Environment Setup For Scripts

The runnable example reads `.env` automatically. Start by copying the template:

```bash
cp .env.example .env
```

For AWS Bedrock with the DRKCACHE-style model, set at least these values:

```env
SHIPIT_LLM_PROVIDER=bedrock
SHIPIT_BEDROCK_MODEL=bedrock/openai.gpt-oss-120b-1:0
AWS_REGION_NAME=us-east-1
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
```

You can also use `AWS_PROFILE` instead of inline AWS keys if your local AWS CLI profile is already configured. Other providers use their standard SDK environment variables, for example `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `GROQ_API_KEY`, or `TOGETHERAI_API_KEY`.

For Vertex AI with a service-account JSON file:

```env
SHIPIT_LLM_PROVIDER=vertex
SHIPIT_VERTEX_MODEL=vertex_ai/gemini-1.5-pro
SHIPIT_VERTEX_CREDENTIALS_FILE=/absolute/path/to/vertex-service-account.json
VERTEXAI_PROJECT=your-gcp-project-id
VERTEXAI_LOCATION=us-central1
```

This automatically maps `SHIPIT_VERTEX_CREDENTIALS_FILE` to `GOOGLE_APPLICATION_CREDENTIALS` for LiteLLM/Vertex usage.

For a generic LiteLLM proxy or server:

```env
SHIPIT_LLM_PROVIDER=litellm
SHIPIT_LITELLM_MODEL=openrouter/openai/gpt-4o-mini
SHIPIT_LITELLM_API_BASE=http://localhost:4000
SHIPIT_LITELLM_API_KEY=your-litellm-key
```

If your LiteLLM route needs a custom provider hint, also set:

```env
SHIPIT_LITELLM_CUSTOM_PROVIDER=openrouter
```

For web search, the default is now:

```env
SHIPIT_WEB_SEARCH_PROVIDER=duckduckgo
```

If you want browser-backed search and browser automation, switch to:

```env
SHIPIT_WEB_SEARCH_PROVIDER=playwright
```

and install Playwright plus its browser bundle:

```bash
pip install -e .[playwright]
playwright install
```

Run the example like this:

```bash
python examples/run_multi_tool_agent.py "Search the web, inspect the workspace, and summarize the result."
```

Enable streaming events with:

```bash
SHIPIT_STREAM=1 python examples/run_multi_tool_agent.py "Plan the work and explain each runtime step."
```

Use the notebook when you want an interactive setup and smoke-test workflow:

```bash
jupyter notebook notebooks/shipit_agent_test_drive.ipynb
```

## Agent Diagnostics

Use `agent.doctor()` to validate provider env, tool setup, MCP attachments, stores, and connector credentials before a real run.

```python
from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

agent = Agent.with_builtins(llm=SimpleEchoLLM())
report = agent.doctor()
print(report.to_markdown())
```

## Project Chat Pattern

For app integration, keep one `Agent` instance tied to a `session_id` and call `run(...)` for each user message.

```python
from shipit_agent import Agent
from shipit_agent.llms import BedrockChatLLM

agent = Agent.with_builtins(
    llm=BedrockChatLLM(model="bedrock/openai.gpt-oss-120b-1:0"),
    session_id="project-chat",
    workspace_root=".shipit_workspace",
)

def chat(user_message: str) -> str:
    result = agent.run(user_message)
    return result.output
```
## Streaming Packet Shape

`agent.stream(...)` yields `AgentEvent` objects. Each event can be serialized with `event.to_dict()`.

Packet shape:

```python
{
    "type": "tool_completed",
    "message": "Tool completed: web_search",
    "payload": {
        "output": "...",
        "iteration": 1,
    },
}
```

Example:

```python
for event in agent.stream("Research the issue and explain each step."):
    print(event.to_dict())
```

Common packet examples:

`run_started`

```python
{
    "type": "run_started",
    "message": "Agent run started",
    "payload": {
        "prompt": "Research the issue and explain each step."
    },
}
```

`tool_called`

```python
{
    "type": "tool_called",
    "message": "Tool called: web_search",
    "payload": {
        "arguments": {"query": "latest incident response workflow"},
        "iteration": 1,
    },
}
```

`tool_completed`

```python
{
    "type": "tool_completed",
    "message": "Tool completed: workspace_files",
    "payload": {
        "output": "Found 12 matching files...",
        "iteration": 1,
    },
}
```

`mcp_attached`

```python
{
    "type": "mcp_attached",
    "message": "MCP server attached: docs",
    "payload": {
        "server": "docs"
    },
}
```

`interactive_request`

```python
{
    "type": "interactive_request",
    "message": "Interactive request from ask_user",
    "payload": {
        "kind": "ask_user",
        "payload": {"interactive": True, "kind": "ask_user"}
    },
}
```

`run_completed`

```python
{
    "type": "run_completed",
    "message": "Agent run completed",
    "payload": {
        "output": "Final answer text here."
    },
}
```

`AgentResult` is also serializable with `result.to_dict()` if you want one final packet containing the full run.

Chat-session wrapper example:

```python
session = agent.chat_session(session_id="project-chat")
reply = session.send("Summarize the current workspace.")

for packet in session.stream_packets(
    "Plan the work and show packet updates.",
    transport="websocket",
):
    print(packet)
```

SSE packet example:

```python
for packet in session.stream_packets(
    "Explain the runtime in SSE packet form.",
    transport="sse",
):
    print(packet)
```
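To push those SSE packets to a browser, a thin web layer is enough. A minimal sketch with FastAPI — it assumes each packet from `stream_packets(transport="sse")` is a ready-to-send SSE chunk; if your version yields dicts instead, serialize them into `data:` lines yourself:

```python
# Minimal FastAPI bridge for the SSE transport above. Assumes each packet is
# already an SSE-formatted chunk; adjust the fallback branch if it's a dict.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

app = FastAPI()
agent = Agent.with_builtins(llm=SimpleEchoLLM())
session = agent.chat_session(session_id="web-chat")


@app.get("/chat")
def chat(prompt: str) -> StreamingResponse:
    def packets():
        for packet in session.stream_packets(prompt, transport="sse"):
            yield packet if isinstance(packet, (str, bytes)) else f"data: {packet}\n\n"
    return StreamingResponse(packets(), media_type="text/event-stream")
```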
## Quick Start

```python
from shipit_agent import Agent, AgentProfileBuilder, FunctionTool
from shipit_agent.llms import SimpleEchoLLM


def add(a: int, b: int) -> str:
    return str(a + b)


agent = (
    AgentProfileBuilder("assistant")
    .description("General purpose assistant")
    .prompt("You are concise, accurate, and tool-aware.")
    .tool(FunctionTool.from_callable(add, name="add"))
    .build(llm=SimpleEchoLLM())
)

result = agent.run("Hello")
print(result.output)
```

## Default Built-In Agent

If you want a capable agent quickly, start here:

```python
from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

agent = Agent.with_builtins(
    llm=SimpleEchoLLM(),
    name="shipit",
    description="General-purpose execution agent",
    workspace_root=".shipit_workspace",
    web_search_provider="duckduckgo",
)

result = agent.run("Research the topic, plan the work, and save a summary.")
print(result.output)
```

## Session History And Memory

You can keep context in two ways:

- pass `history=[Message(...), ...]` to seed the agent with prior turns
- use `session_store` plus `session_id` to persist history across runs

```python
from shipit_agent import Agent, InMemorySessionStore, Message
from shipit_agent.llms import SimpleEchoLLM

agent = Agent(
    llm=SimpleEchoLLM(),
    history=[
        Message(role="user", content="We are building an incident response workflow."),
        Message(role="assistant", content="Understood. I will keep the design focused on operations."),
    ],
    session_store=InMemorySessionStore(),
    session_id="incident-workflow",
)
```

## Tool-Calling Example

```python
from shipit_agent import Agent, FunctionTool
from shipit_agent.llms import LLMResponse
from shipit_agent.models import ToolCall


class DemoLLM:
    def complete(self, *, messages, tools=None, system_prompt=None, metadata=None):
        return LLMResponse(
            content="The tool has been executed.",
            tool_calls=[ToolCall(name="add", arguments={"a": 2, "b": 3})],
        )


def add(a: int, b: int) -> str:
    return str(a + b)


agent = Agent(
    llm=DemoLLM(),
    prompt="You are a precise assistant.",
    tools=[FunctionTool.from_callable(add)],
)

result = agent.run("Add 2 and 3")
print(result.tool_results[0].output)
```

## Creating A New Tool

The simplest path is wrapping a normal Python callable:

```python
from shipit_agent import FunctionTool


def slugify(value: str) -> str:
    """Convert a title to a simple slug."""
    return value.lower().replace(" ", "-")


tool = FunctionTool.from_callable(
    slugify,
    name="slugify",
    description="Turn text into a URL-friendly slug.",
)
```

If you want full control over schema, output metadata, and prompt guidance, create a tool class:

```python
from shipit_agent.tools.base import ToolContext, ToolOutput


class WordCountTool:
    name = "count_words"
    description = "Count the number of words in a string."
    prompt = "Use this when the user needs deterministic word counts."
    prompt_instructions = "Prefer this over estimating counts in prose."

    def schema(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string", "description": "Text to count"},
                    },
                    "required": ["text"],
                },
            },
        }

    def run(self, context: ToolContext, **kwargs) -> ToolOutput:
        text = kwargs["text"]
        count = len(text.split())
        return ToolOutput(
            text=str(count),
            metadata={"word_count": count},
        )
```

Then attach it to an agent:

```python
from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

agent = Agent(
    llm=SimpleEchoLLM(),
    tools=[WordCountTool()],
)
```

## Core Concepts

- `Agent`: public entrypoint
- `LLM`: protocol adapter for any model provider
- `Tool`: executable function with schema and structured output
- `MCPServer`: wrapper for MCP-backed tool collections
- `AgentProfileBuilder`: reusable builder for shipping presets
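The `LLM` protocol above is structural: anything with the `complete(...)` signature used by `DemoLLM` in the Tool-Calling Example plugs in. A hedged sketch wrapping a hypothetical local HTTP completion server — the endpoint, JSON shape, and the `.role`/`.content` attributes on messages are assumptions:

```python
# Custom LLM adapter sketch. The `complete` signature mirrors DemoLLM above;
# the HTTP endpoint, JSON shape, and message attributes are assumptions.
import json
import urllib.request

from shipit_agent import Agent
from shipit_agent.llms import LLMResponse


class HTTPChatLLM:
    def __init__(self, base_url: str = "http://localhost:9000/complete"):
        self.base_url = base_url

    def complete(self, *, messages, tools=None, system_prompt=None, metadata=None):
        body = json.dumps({
            "system": system_prompt,
            "messages": [{"role": m.role, "content": m.content} for m in messages],
        }).encode()
        request = urllib.request.Request(
            self.base_url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        return LLMResponse(content=data["content"], tool_calls=[])


agent = Agent(llm=HTTPChatLLM())
```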
## Prebuilt Tools

```python
from shipit_agent import (
    Agent,
    AskUserTool,
    ArtifactBuilderTool,
    HumanReviewTool,
    GmailTool,
    MemoryTool,
    OpenURLTool,
    PlaywrightBrowserTool,
    PlannerTool,
    PromptTool,
    ToolSearchTool,
    VerifierTool,
    WebSearchTool,
    WorkspaceFilesTool,
)
from shipit_agent.llms import SimpleEchoLLM

agent = Agent(
    llm=SimpleEchoLLM(),
    prompt="You are a capable research agent.",
    tools=[
        WebSearchTool(),
        OpenURLTool(),
        PlaywrightBrowserTool(),
        AskUserTool(),
        HumanReviewTool(),
        MemoryTool(),
        PlannerTool(),
        PromptTool(),
        VerifierTool(),
        ToolSearchTool(),
        ArtifactBuilderTool(),
        WorkspaceFilesTool(),
        GmailTool(),
    ],
)
```

### `tool_search` — let the agent discover its own tools

When an agent has more than a handful of tools, two problems appear:

1. **Token bloat** — every turn ships the full tool catalog to the LLM.
2. **Tool hallucination** — similar tool names get confused and the model invents ones that don't exist.

`ToolSearchTool` solves both. Give the model a plain-language query and it returns a **ranked shortlist** of the best-matching tools currently registered on the agent, with names, descriptions, usage hints, and relevance scores:

```python
from shipit_agent import Agent, ToolSearchTool
from shipit_agent.llms import OpenAIChatLLM

agent = Agent.with_builtins(llm=OpenAIChatLLM(model="gpt-4o-mini"))
# ToolSearchTool is included in Agent.with_builtins automatically.

result = agent.run(
    "Use tool_search to find the right tool for fetching a specific URL, "
    "then call that tool on https://example.com"
)
```

The model will first call `tool_search({"query": "fetch a specific URL", "limit": 3})` and receive something like:

```
Best tools for 'fetch a specific URL' (ranked by relevance):
1. open_url (score=0.4217) — Fetch a URL and return a clean text excerpt.
   ↳ when to use: Use this when you need exact content from a specific URL…
2. playwright_browser (score=0.3104) — Drive a headless browser…
   ↳ when to use: Use for pages requiring interaction or anti-bot protection.
3. web_search (score=0.1875) — Search the web with a configurable provider…
   ↳ when to use: Use when you need fresh information from the internet.
```

Then call `open_url` directly with the right arguments. No hallucinations, no wasted tokens on 27 irrelevant schemas.

**Scoring algorithm** (pure stdlib, no embeddings, no external API):

```
score = SequenceMatcher(query, haystack).ratio() + 0.12 × token_hits
```

where `haystack` concatenates each tool's `name`, `description`, and `prompt_instructions`, and `token_hits` counts how many query words appear literally. Tie-broken by insertion order. Results below `score=0.05` are filtered as noise.

**Configurable knobs:** `max_limit` (hard cap, default 10), `default_limit` (default 5), `token_bonus` (default 0.12). Override at construction time:

```python
ToolSearchTool(max_limit=15, default_limit=8, token_bonus=0.20)
```
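The scoring rule above fits in a few lines of stdlib Python. An illustrative reimplementation, not the library's source:

```python
# Illustrative stdlib version of the tool_search scoring rule above.
from difflib import SequenceMatcher


def tool_score(query: str, name: str, description: str,
               instructions: str = "", token_bonus: float = 0.12) -> float:
    haystack = f"{name} {description} {instructions}".lower()
    query = query.lower()
    fuzzy = SequenceMatcher(None, query, haystack).ratio()
    token_hits = sum(1 for word in query.split() if word in haystack)
    return fuzzy + token_bonus * token_hits


print(tool_score(
    "fetch a specific URL",
    "open_url",
    "Fetch a URL and return a clean text excerpt.",
))
```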
## Using Multiple Tools In One Agent

A practical pattern is to combine built-in tools with your own callable tools. The example script does exactly that: it wires `WebSearchTool`, `OpenURLTool`, `WorkspaceFilesTool`, `CodeExecutionTool`, and other built-ins together with local `FunctionTool` helpers like `project_context` and `add_numbers`.

```python
from shipit_agent import Agent, FunctionTool, get_builtin_tools
from shipit_agent.llms import BedrockChatLLM

llm = BedrockChatLLM(model="bedrock/openai.gpt-oss-120b-1:0")
tools = get_builtin_tools(llm=llm, workspace_root=".shipit_workspace")
tools.append(FunctionTool.from_callable(add_numbers, name="add_numbers"))

agent = Agent(llm=llm, tools=tools)
```

You can mix deterministic tools, built-in tools, and file/code tools together:

```python
from shipit_agent import Agent, CodeExecutionTool, FunctionTool, WebSearchTool, WorkspaceFilesTool
from shipit_agent.llms import SimpleEchoLLM


def extract_keywords(text: str) -> str:
    words = [word.strip(".,").lower() for word in text.split()]
    return ", ".join(sorted(set(word for word in words if len(word) > 5)))


agent = Agent(
    llm=SimpleEchoLLM(),
    tools=[
        WebSearchTool(provider="duckduckgo"),
        WorkspaceFilesTool(root_dir=".shipit_workspace"),
        CodeExecutionTool(workspace_root=".shipit_workspace/code"),
        FunctionTool.from_callable(extract_keywords, name="extract_keywords"),
    ],
)
```

That setup lets one agent:

- search the web
- run local computation
- save files
- use your own deterministic helper functions

## Web Search Provider Selection

`WebSearchTool` accepts either a provider object or a provider name. The default provider is `duckduckgo` so the library works without extra browser setup.

```python
from shipit_agent import WebSearchTool

default_search = WebSearchTool()
duckduckgo_search = WebSearchTool(provider="duckduckgo")
playwright_search = WebSearchTool(provider="playwright")
brave_search = WebSearchTool(provider="brave", api_key="BRAVE_API_KEY")
serper_search = WebSearchTool(provider="serper", api_key="SERPER_API_KEY")
tavily_search = WebSearchTool(provider="tavily", api_key="TAVILY_API_KEY")
```

You can also pass provider config:

```python
search = WebSearchTool(
    provider="duckduckgo",
    provider_config={"timeout": 20.0},
)
```

Use `playwright` only when JavaScript rendering matters:

```python
search = WebSearchTool(
    provider="playwright",
    provider_config={"timeout_ms": 20000},
)
```
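What a "provider object" looks like is not spelled out in this README, so treat the following as a template only — the `search` method name and the result shape are guesses; check `shipit_agent/tools/web_search/providers.py` for the real protocol:

```python
# Hypothetical provider object. The `search` method name and return shape are
# guesses — mirror the real protocol in tools/web_search/providers.py.
from shipit_agent import WebSearchTool


class StaticSearchProvider:
    """Offline provider that answers every query from a canned result list."""

    def search(self, query: str, limit: int = 5):
        results = [
            {
                "title": "SHIPIT documentation",
                "url": "https://docs.shipiit.com/",
                "snippet": f"Canned answer for: {query}",
            }
        ]
        return results[:limit]


search = WebSearchTool(provider=StaticSearchProvider())
```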
## Default Agent Setup

`Agent` now ships with a default system prompt, retry policy, and router policy, so this works without extra setup:

```python
from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

agent = Agent.with_builtins(llm=SimpleEchoLLM())
result = agent.run("Research the problem, plan the work, and save a report.")
```

You can override policies without replacing the whole prompt:

```python
from shipit_agent import Agent, RetryPolicy, RouterPolicy
from shipit_agent.llms import SimpleEchoLLM

agent = Agent(
    llm=SimpleEchoLLM(),
    retry_policy=RetryPolicy(max_llm_retries=2, max_tool_retries=1),
    router_policy=RouterPolicy(auto_plan=True, long_prompt_threshold=80),
)
```

## Code Execution

```python
from shipit_agent import CodeExecutionTool

tool = CodeExecutionTool()
result = tool.run(
    # Anonymous stand-in for a ToolContext: any object with a `state` dict.
    context=type("Ctx", (), {"state": {}})(),
    language="python",
    code="print('hello from shipit')",
)
```

Supported interpreter families include `python`, `bash`, `sh`, `zsh`, `javascript`, `typescript`, `ruby`, `php`, `perl`, `lua`, and `r`, subject to the interpreter being installed locally.

Example with file generation:

```python
from shipit_agent import Agent, CodeExecutionTool, WorkspaceFilesTool
from shipit_agent.llms import SimpleEchoLLM

agent = Agent(
    llm=SimpleEchoLLM(),
    tools=[
        CodeExecutionTool(workspace_root=".shipit_workspace/code"),
        WorkspaceFilesTool(root_dir=".shipit_workspace"),
    ],
)
```

## MCP Discovery

```python
from shipit_agent import Agent, RemoteMCPServer, MCPHTTPTransport
from shipit_agent.llms import SimpleEchoLLM

mcp = RemoteMCPServer(
    name="docs",
    transport=MCPHTTPTransport("http://localhost:8080/mcp"),
)

agent = Agent.with_builtins(
    llm=SimpleEchoLLM(),
    mcps=[mcp],
)
```

You can also use subprocess transport for local MCP servers:

```python
from shipit_agent import PersistentMCPSubprocessTransport, RemoteMCPServer

mcp = RemoteMCPServer(
    name="local_docs",
    transport=PersistentMCPSubprocessTransport(["python", "my_mcp_server.py"]),
)
```

## Gmail And Third-Party Tools

`shipit_agent` now has a connector-style credential layer so tools like Gmail can be added cleanly instead of embedding credentials directly inside each tool.

```python
from shipit_agent import Agent, CredentialRecord, FileCredentialStore, GmailTool
from shipit_agent.llms import SimpleEchoLLM

credential_store = FileCredentialStore(".shipit_workspace/credentials.json")
credential_store.set(
    CredentialRecord(
        key="gmail",
        provider="gmail",
        secrets={
            "access_token": "ACCESS_TOKEN",
            "refresh_token": "REFRESH_TOKEN",
            "client_id": "CLIENT_ID",
            "client_secret": "CLIENT_SECRET",
        },
    )
)

agent = Agent(
    llm=SimpleEchoLLM(),
    credential_store=credential_store,
    tools=[GmailTool()],
)
```

This same pattern can be reused for:

- Google Calendar
- Google Drive
- Slack
- Linear
- Jira
- Notion
- Confluence
- custom internal APIs

## Using Tools And MCP Together

One agent can combine built-in tools, custom tools, and remote MCP capabilities at the same time:

```python
from shipit_agent import Agent, MCPHTTPTransport, RemoteMCPServer, WebSearchTool, WorkspaceFilesTool
from shipit_agent.llms import SimpleEchoLLM

mcp = RemoteMCPServer(
    name="design_system",
    transport=MCPHTTPTransport("http://localhost:8080/mcp"),
)

agent = Agent(
    llm=SimpleEchoLLM(),
    tools=[
        WebSearchTool(provider="duckduckgo"),
        WorkspaceFilesTool(root_dir=".shipit_workspace"),
    ],
    mcps=[mcp],
)
```

That lets the runtime choose between:

- local tools
- remote MCP tools
- your own custom tools

## Streaming Events

Use `stream()` when you want step-by-step runtime events:

```python
from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

agent = Agent.with_builtins(llm=SimpleEchoLLM())

for event in agent.stream("Investigate this problem and use tools if needed."):
    print(event.type, event.message, event.payload)
```

Typical events include:

| Event type | When it fires | Key payload fields |
|---|---|---|
| `run_started` | Very first event of a run, once per `stream()`/`run()` call. | `prompt` |
| `mcp_attached` | Once per attached MCP server, right after `run_started`. | `server` |
| `planning_started` | The router policy decided the prompt is complex enough to invoke the `plan_task` tool. Fires **before** the first LLM call. | `prompt` |
| `planning_completed` | Planner returned. The plan is injected into the message history as a `user`-role context message so Bedrock tool pairing stays intact. | `output` |
| `step_started` | Each iteration of the tool loop, right before calling the LLM. | `iteration`, `tool_count` |
| `reasoning_started` | 🧠 The LLM response contained a thinking/reasoning block (OpenAI o-series, `gpt-oss`, Claude extended thinking, DeepSeek R1…). Fires **once per iteration** when reasoning is present. | `iteration` |
| `reasoning_completed` | Immediately after `reasoning_started`, carrying the full reasoning text. Use this to render a "Thinking" panel in your UI. | `iteration`, `content` |
| `tool_called` | The model decided to call a tool. Fires before execution. | `iteration`, `arguments` (`ev.message` is `"Tool called: <name>"`) |
| `tool_completed` | Tool finished successfully. | `iteration`, `output` |
| `tool_retry` | Transient tool failure, retry scheduled by `RetryPolicy`. | `iteration`, `attempt`, `error` |
| `tool_failed` | Tool raised a non-retryable error, **or** the model hallucinated a tool name that isn't registered. In the second case a synthetic error tool-result is still appended so tool_use/tool_result pairing stays balanced (required by Bedrock Converse). | `iteration`, `error` |
| `llm_retry` | Transient LLM provider error, retry scheduled. | `attempt`, `error` |
| `interactive_request` | A tool returned `metadata.interactive=True` (e.g. `ask_user`, human review). UI can pause and collect input. | `kind`, `payload` |
| `run_completed` | Final event, emitted once no more tool calls are requested. | `output` |
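One way to consume that table is a dispatch loop that routes each event type to a different sink — reasoning text to a "Thinking" panel, tool progress to a log, the final output to the user. The payload keys below come straight from the table:

```python
# Route events from the table above to different sinks. Payload keys match
# the table; what you render is up to you.
from shipit_agent import Agent
from shipit_agent.llms import SimpleEchoLLM

agent = Agent.with_builtins(llm=SimpleEchoLLM())
thinking: list[str] = []

for event in agent.stream("Investigate this problem and use tools if needed."):
    if event.type == "reasoning_completed":
        thinking.append(event.payload["content"])   # feed a "Thinking" panel
    elif event.type in ("tool_called", "tool_completed"):
        print(f"iter {event.payload.get('iteration')}: {event.message}")
    elif event.type in ("tool_retry", "llm_retry"):
        print(f"retry after error: {event.payload.get('error')}")
    elif event.type == "run_completed":
        print(event.payload["output"])              # final answer
```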
### Reasoning / "thinking" steps

When the underlying model surfaces reasoning content (OpenAI o-series via `reasoning_effort`, Anthropic Claude via extended thinking, AWS Bedrock `openai.gpt-oss-120b` via native reasoning, DeepSeek R1, etc.) the runtime automatically extracts it from the provider response and emits a `reasoning_started` + `reasoning_completed` pair before any subsequent `tool_called` events. The LiteLLM adapter handles three shapes:

1. **Flat `reasoning_content`** on the response message (OpenAI / gpt-oss / DeepSeek via LiteLLM).
2. **Anthropic `thinking_blocks[*].thinking`** (Claude extended thinking).
3. **`model_dump()` fallback** — any `reasoning_content` / `thinking_blocks` key found in the pydantic dump.

No extra configuration is needed — `LLMResponse.reasoning_content` is populated automatically, and the runtime emits the events whenever it's non-empty. Models that don't expose reasoning simply won't produce these events; no error, no warning.
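Handling those three shapes generically takes only a few lines. A sketch in the spirit of the shared helper described above — not the library's actual code:

```python
# Generic extraction covering the three shapes above — a sketch in the spirit
# of the shared helper, not the library's actual code.
def extract_reasoning(message) -> str | None:
    # 1. Flat attribute (OpenAI / gpt-oss / DeepSeek via LiteLLM).
    text = getattr(message, "reasoning_content", None)
    if text:
        return text
    # 2. Anthropic extended-thinking blocks (objects or dicts).
    blocks = getattr(message, "thinking_blocks", None) or []
    joined = "".join(
        (b.get("thinking") if isinstance(b, dict) else getattr(b, "thinking", None)) or ""
        for b in blocks
    )
    if joined:
        return joined
    # 3. Pydantic fallback: look for either key in the serialized dump.
    if hasattr(message, "model_dump"):
        dump = message.model_dump()
        if dump.get("reasoning_content"):
            return dump["reasoning_content"]
        joined = "".join(
            (b or {}).get("thinking") or "" for b in dump.get("thinking_blocks") or []
        )
        if joined:
            return joined
    return None
```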
A typical Bedrock `gpt-oss-120b` run with two tool calls produces:

```
1.  run_started        → "Agent run started"
2.  planning_started   → "Planner started"
3.  planning_completed → "Planner completed"
4.  step_started       → iteration=1, tool_count=28
5.  reasoning_started  → 🧠 iteration=1
6.  reasoning_completed→ 🧠 "The user wants two independent live BTC prices. I'll start with web_search..."
7.  tool_called        → "Tool called: web_search"
8.  tool_completed     → "Tool completed: web_search"
9.  step_started       → iteration=2
10. reasoning_completed→ 🧠 "Now I'll open both URLs to confirm..."
11. tool_called        → "Tool called: open_url"
12. tool_completed     → "Tool completed: open_url"
13. tool_called        → "Tool called: open_url"
14. tool_completed     → "Tool completed: open_url"
15. step_started       → iteration=3
16. reasoning_completed→ 🧠 "Both sources agree within $40..."
17. run_completed      → final markdown report
```

Because `stream()` runs the agent on a background thread and pushes events through a queue, every event is yielded **as it happens** — your UI can render the thinking block, then the tool calls, then the final answer incrementally (see `notebooks/04_agent_streaming_packets.ipynb` for a live example).

### Tool lifecycle & Bedrock tool pairing

AWS Bedrock's Converse API enforces strict 1:1 pairing between `toolUse` blocks in an assistant turn and `toolResult` blocks in the next user turn. The runtime guarantees this invariant:

- Every `response.tool_calls` entry gets **either** a successful tool-result message **or** a synthetic error tool-result (for hallucinated/unregistered tool names) — never a dropped orphan.
- The planner output is injected as a regular `user`-role context message rather than a `tool`-role message, so it never appears as an unpaired `toolResult`.
- Each tool call is stamped with a stable `call_{iteration}_{index}` ID that round-trips through `Message.metadata.tool_calls[i].id` ↔ `Message.metadata.tool_call_id` on the result.

This means multi-iteration tool loops work reliably on Bedrock Claude, Bedrock gpt-oss, and Anthropic native — no `modify_params` band-aids required.
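The pairing invariant is easy to state in code. A schematic version of the rule — placeholder types, not the runtime's internals:

```python
# Schematic version of the pairing invariant: every toolUse gets exactly one
# toolResult, real or synthetic. Placeholder types, not runtime internals.
def pair_tool_results(tool_calls, registry, run_tool, iteration: int) -> list[dict]:
    results = []
    for index, call in enumerate(tool_calls):
        call_id = f"call_{iteration}_{index}"      # stable round-trip ID
        if call.name not in registry:
            # Hallucinated tool name: synthesize an error result so the
            # toolUse/toolResult pairing required by Bedrock never drifts.
            results.append({"tool_call_id": call_id, "status": "error",
                            "content": f"Unknown tool: {call.name}"})
        else:
            output = run_tool(call.name, call.arguments)
            results.append({"tool_call_id": call_id, "status": "success",
                            "content": output})
    return results
```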
## End-To-End Example

This is a more realistic setup for a project agent:

```python
from shipit_agent import Agent, MCPHTTPTransport, RemoteMCPServer
from shipit_agent.llms import OpenAIChatLLM

mcp = RemoteMCPServer(
    name="project_docs",
    transport=MCPHTTPTransport("http://localhost:8080/mcp"),
)

agent = Agent.with_builtins(
    llm=OpenAIChatLLM(model="gpt-4o-mini"),
    mcps=[mcp],
    workspace_root=".shipit_workspace",
    web_search_provider="brave",
    web_search_api_key="BRAVE_API_KEY",
    metadata={
        "workspace_root": ".shipit_workspace",
        "artifact_workspace_root": ".shipit_workspace/artifacts",
    },
)

result = agent.run(
    "Research the latest approach, inspect remote docs through MCP, "
    "write a summary file, and generate a final artifact."
)

print(result.output)
```

Tool layout:

- `shipit_agent/tools/open_url/open_url_tool.py`
- `shipit_agent/tools/web_search/providers.py`
- `shipit_agent/tools/web_search/web_search_tool.py`
- `shipit_agent/tools/ask_user/ask_user_tool.py`
- `shipit_agent/tools/human_review/human_review_tool.py`
- `shipit_agent/tools/prompt/prompt_tool.py`
- `shipit_agent/tools/verifier/verifier_tool.py`
- `shipit_agent/tools/sub_agent/sub_agent_tool.py`
- `shipit_agent/tools/tool_search/tool_search_tool.py`
- `shipit_agent/tools/artifact_builder/artifact_builder_tool.py`
- `shipit_agent/tools/code_execution/code_execution_tool.py`
- `shipit_agent/tools/playwright_browser/playwright_browser_tool.py`
- `shipit_agent/tools/memory/memory_tool.py`
- `shipit_agent/tools/planner/planner_tool.py`
- `shipit_agent/tools/workspace_files/workspace_files_tool.py`

## Model Adapters

- `shipit_agent.llms.OpenAIChatLLM`
- `shipit_agent.llms.AnthropicChatLLM`
- `shipit_agent.llms.LiteLLMChatLLM`
- `shipit_agent.llms.BedrockChatLLM`
- `shipit_agent.llms.GeminiChatLLM`
- `shipit_agent.llms.VertexAIChatLLM`
- `shipit_agent.llms.GroqChatLLM`
- `shipit_agent.llms.TogetherChatLLM`
- `shipit_agent.llms.OllamaChatLLM`

These adapters use optional dependencies and raise a clear error if the provider SDK is not installed.

Example:

```python
from shipit_agent import Agent
from shipit_agent.llms import BedrockChatLLM, GeminiChatLLM, LiteLLMChatLLM, OpenAIChatLLM, VertexAIChatLLM

openai_agent = Agent(llm=OpenAIChatLLM(model="gpt-4o-mini"))
bedrock_agent = Agent(llm=BedrockChatLLM())
gemini_agent = Agent(llm=GeminiChatLLM())
vertex_agent = Agent(llm=VertexAIChatLLM(model="vertex_ai/gemini-1.5-pro"))
generic_agent = Agent(llm=LiteLLMChatLLM(model="groq/llama-3.3-70b-versatile"))
```

## State

- `InMemoryMemoryStore` / `FileMemoryStore`
- `InMemorySessionStore` / `FileSessionStore`
- `FileTraceStore` and `FileCredentialStore` for traces and connector credentials

The runtime can persist messages across runs with `session_id` and store tool outputs as memory facts.

## Runtime Features

- default system prompt via `DEFAULT_AGENT_PROMPT`
- retry policy via `RetryPolicy`
- auto-planning router via `RouterPolicy`
- remote MCP discovery and transport adapters
- artifact export to files

## Status

This is a growing standalone agent runtime with built-in tools, remote MCP support, strong runtime policies, and provider adapters.

---

<p align="center">
  <img src="shipit-icon.svg" alt="SHIPIT" width="40" height="40" />
  <br />
  <strong>Built with love. Powered by your choice of AI models.</strong>
  <br />
  <sub>Ship it fast. Ship it right.</sub>
</p>