https://github.com/sukethrp/agentos

"The Operating System for AI Agents. Build, Test, Deploy, Monitor, Govern."

agent-framework agent-monitoring agent-testing agentos ai-agent-platform ai-agents anthropic fastapi governance llm ollama openai python rag safety

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/sukethrp/agentos
Owner: sukethrp
Created: 2026-02-06T05:05:50.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-03-21T21:30:36.000Z (4 months ago)
Last Synced: 2026-03-22T07:42:41.590Z (3 months ago)
Topics: agent-framework, agent-monitoring, agent-testing, agentos, ai-agent-platform, ai-agents, anthropic, fastapi, governance, llm, ollama, openai, python, rag, safety
Language: Python
Homepage: https://agentos-mocha.vercel.app
Size: 825 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Code of conduct: CODE_OF_CONDUCT.md

# 🤖 AgentOS

  The Operating System for AI Agents

  Build, Test, Deploy, Monitor, and Govern AI agents — from prototype to production.

🌐 Live Demo · 🚀 Quick Start · 📋 Issues

> **For teams who need to deploy AI agents with testing, governance, and monitoring built in — not bolted on.**

## 3 Differentiators

- 🧪 **Test**: Run scenario-based simulation before deploy, with quality and cost scoring.

- 🛡️ **Govern**: Enforce budgets, permissions, and kill-switch policies with auditability.

- 📊 **Monitor**: Observe live agent runs, tool usage, latency, and spend in one dashboard.

## Quick Start

```bash
pip install agentos-platform
```

Minimal example:

```python
from agentos.governed_agent import GovernedAgent
from agentos.core.tool import tool


@tool(description="Add two numbers")
def add(a: float, b: float) -> float:
    return a + b


agent = GovernedAgent(name="demo", model="gpt-4o-mini", tools=[add])
print(agent.run("What is 12.5 + 7.5?"))
```

Demo mode:

```bash
AGENTOS_DEMO_MODE=true python examples/run_web_builder.py
```

## Features

### MCP server with stdio/SSE transport (Claude Desktop + Cursor)

Install the MCP extra:

```bash
pip install 'agentos-platform[mcp]'
```

### 1) Start the MCP server

Expose built-in AgentOS tools (stdio transport is the safest choice for MCP clients like Claude Desktop and Cursor):

```bash
agentos mcp serve --transport stdio
```

Expose tools from a specific agent module (for example, `./my_agent/agent.py`):

```bash
agentos mcp serve --transport stdio --agent ./my_agent
```

Optional: run the HTTP SSE transport for clients that support it:

```bash
agentos mcp serve --transport sse --host 127.0.0.1 --port 8080
```

### 2) Configure Claude Desktop

Add the following snippet to your `claude_desktop_config.json` (restart Claude Desktop after editing):

```json
{
  "mcpServers": {
    "agentos": {
      "command": "agentos",
      "args": ["mcp", "serve", "--transport", "stdio"]
    }
  }
}
```

To expose a specific agent module instead:

```json
{
  "mcpServers": {
    "agentos": {
      "command": "agentos",
      "args": ["mcp", "serve", "--transport", "stdio", "--agent", "/absolute/path/to/agent.py"]
    }
  }
}
```
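If the config file already lists other MCP servers, pasting the snippet over it would drop them. The sketch below merges the `agentos` entry into an existing config instead. The helper names `add_mcp_server` and `update_config_file` are hypothetical and not part of AgentOS; the location of `claude_desktop_config.json` varies by platform, so the path is left for you to supply.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `config` with one MCP server entry added or replaced."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, command: str, args: list[str]) -> None:
    """Read the JSON config at `path` (if it exists), merge the entry, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(existing, name, command, args)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For example, `update_config_file(Path("claude_desktop_config.json"), "agentos", "agentos", ["mcp", "serve", "--transport", "stdio"])` would write the same entry as the first snippet above while keeping any other servers in place.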

### 3) Configure Cursor

Add the following to Cursor's `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "agentos": {
      "command": "agentos",
      "args": ["mcp", "serve", "--transport", "stdio"]
    }
  }
}
```

### Agent delegation (delegate tool + SharedContext + chaining)

AgentOS includes a structured delegation system that lets a “parent” agent offload subtasks to “child” agents while propagating rich context through a shared, in-memory key/value store.

Key pieces:

- `delegate_subtask` tool: LLM-facing tool that accepts structured fields like `task`, `context_json`, `constraints_json`, `expected_output_schema_json`, and `timeout`.

- `SharedContext`: a key/value store child agents can read/write during the delegation chain (avoids lossy prompt compression).

- Delegation chaining: if a child agent delegates again, the same shared context key is reused automatically.
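To make the structured fields of `delegate_subtask` concrete, here is a rough sketch of how a caller might assemble its arguments. `DelegateRequest` is a hypothetical helper, not an AgentOS class. It assumes the `*_json` fields carry JSON-encoded strings (inferred from their names) and that `timeout` is in seconds; neither is confirmed by this README.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DelegateRequest:
    """Illustrative container for the arguments of a `delegate_subtask` call."""

    task: str
    context: dict[str, Any] = field(default_factory=dict)
    constraints: dict[str, Any] = field(default_factory=dict)
    expected_output_schema: dict[str, Any] = field(default_factory=dict)
    timeout: float = 30.0  # assumption: seconds; the default is arbitrary

    def to_tool_args(self) -> dict[str, Any]:
        """Validate, then serialize nested fields to JSON strings."""
        if not self.task.strip():
            raise ValueError("task must be a non-empty string")
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")
        return {
            "task": self.task,
            "context_json": json.dumps(self.context, sort_keys=True),
            "constraints_json": json.dumps(self.constraints, sort_keys=True),
            "expected_output_schema_json": json.dumps(
                self.expected_output_schema, sort_keys=True
            ),
            "timeout": self.timeout,
        }
```

Keeping context, constraints, and the output schema as separate fields, rather than folding them into one prompt string, is what lets a child agent receive them without lossy summarization.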

Minimal wiring example:

```python
from agentos.core.agent import Agent
from agentos.core.delegation import DelegationManager

# Define your child agents however you like.
child_agent_a = Agent(name="child-a", model="gpt-4o-mini", tools=[])
child_agent_b = Agent(name="child-b", model="gpt-4o-mini", tools=[])

manager = DelegationManager()
manager.register_agent("child-a", child_agent_a)
manager.register_agent("child-b", child_agent_b)

# Create your parent agent and attach the delegate tool.
parent = Agent(name="parent", model="gpt-4o-mini", tools=[])
manager.attach_delegate_tool(parent)  # adds `delegate_subtask` to the toolset

# Now the parent agent can call `delegate_subtask`.
parent.run("Delegate a subtask and use shared context for details.")
```

SharedContext tools available to delegated agents:

- `shared_context_key()`

- `shared_context_get(key)`

- `shared_context_set(key, value_json)`

- `shared_context_dump()`
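The four tools above can be pictured as thin wrappers over an in-memory dictionary. The sketch below is a toy stand-in, not the AgentOS implementation: the class name `ToySharedContext`, the return types, and the choice to store parsed JSON are all assumptions made for illustration.

```python
import json
import uuid
from typing import Any, Optional


class ToySharedContext:
    """In-memory stand-in for a shared context store (illustrative only)."""

    def __init__(self, key: Optional[str] = None) -> None:
        # A chained delegation would pass the parent's key so the same store is reused.
        self.key = key or uuid.uuid4().hex
        self._data: dict[str, Any] = {}

    def shared_context_key(self) -> str:
        return self.key

    def shared_context_get(self, key: str) -> Any:
        # Missing keys return None rather than raising.
        return self._data.get(key)

    def shared_context_set(self, key: str, value_json: str) -> None:
        # Values arrive as JSON text, mirroring the `value_json` parameter name.
        self._data[key] = json.loads(value_json)

    def shared_context_dump(self) -> str:
        return json.dumps(self._data, sort_keys=True)


ctx = ToySharedContext(key="run-1")
ctx.shared_context_set("customer", '{"id": 42}')
snapshot = ctx.shared_context_dump()  # '{"customer": {"id": 42}}'
```

Because every agent in the chain reads and writes the same store, details written by the parent stay available to a grandchild without being re-stated in each prompt.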

## Core Modules

| Module | What it does |
|--------|---------------|
| Agent SDK | Define agents and tools with provider-agnostic model routing |
| Simulation Sandbox | Test scenarios with LLM-as-judge quality and pass/fail scoring |
| Governance Engine | Budget controls, permissions, kill switch, and audit logging |
| Live Dashboard | Real-time traces for prompts, tool calls, latency, and spend |
| RAG Pipeline | Ingest, chunk, embed, and retrieve knowledge sources |
| Workflow Engine | Compose repeatable multi-step agent workflows |
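As a rough picture of what budget controls, a kill switch, and audit logging mean in practice, the sketch below gates each model or tool call on a spend check. It is a conceptual toy under stated assumptions: `ToyGovernor`, `BudgetExceeded`, and `KillSwitchEngaged` are hypothetical names and do not reflect the Governance Engine's actual API.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the budget."""


class KillSwitchEngaged(RuntimeError):
    """Raised when the agent has been stopped by an operator."""


@dataclass
class ToyGovernor:
    budget_usd: float
    spent_usd: float = 0.0
    killed: bool = False
    audit_log: list[str] = field(default_factory=list)

    def charge(self, cost_usd: float, note: str = "") -> None:
        """Record a cost if allowed; every decision is appended to the audit log."""
        if self.killed:
            self.audit_log.append(f"DENY kill-switch {note}".strip())
            raise KillSwitchEngaged("agent stopped by kill switch")
        if self.spent_usd + cost_usd > self.budget_usd:
            self.audit_log.append(f"DENY budget {note}".strip())
            raise BudgetExceeded(f"budget of {self.budget_usd} USD would be exceeded")
        self.spent_usd += cost_usd
        self.audit_log.append(f"ALLOW {cost_usd:.4f} {note}".strip())
```

Calling `charge` before each model request means a denied call never reaches the provider, and the audit log records denials as well as approvals.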

📋 Full 15-module list:

| Module | Description |
|--------|-------------|
| Agent SDK | Core governed agent runtime and tool-calling loop |
| WebSocket Streaming | Token streaming and low-latency interactive sessions |
| RAG Pipeline | Ingestion, chunking, embeddings, retrieval, and reranking |
| Simulation Sandbox | Scenario simulation, scoring, and comparison reports |
| Live Dashboard | Event stream, usage analytics, and operational visibility |
| Governance Engine | Guardrails, budget caps, permission checks, and audits |
| Agent Scheduler | Interval and cron scheduling with execution history |
| Event Bus | Trigger-driven orchestration via internal and external events |
| Plugin System | Runtime-extensible tools, providers, and adapters |
| Authentication | API key auth, org and user usage tracking, and middleware |
| A/B Testing | Side-by-side evaluation for variants and prompt changes |
| Workflow Engine | DAG-based execution with retries and branching |
| Multimodal | Vision and document flows for image and file-aware agents |
| Marketplace | Template registry for reusable agents and workflows |
| Embed SDK | Embeddable widget and integration surface for web apps |
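The Workflow Engine row mentions DAG-based execution with retries. As a conceptual illustration only, the sketch below runs steps in dependency order using the standard library's `graphlib` and retries a failing step a fixed number of times. `run_dag` and its parameters are hypothetical and are not the AgentOS workflow API; branching is left out for brevity.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable

# A step receives the results of all steps completed so far.
Step = Callable[[dict[str, Any]], Any]


def run_dag(
    steps: dict[str, Step],
    deps: dict[str, set[str]],
    max_retries: int = 2,
) -> dict[str, Any]:
    """Run steps in dependency order, retrying each failing step up to `max_retries` times."""
    graph = {name: deps.get(name, set()) for name in steps}
    unknown = set().union(*graph.values()) - set(steps)
    if unknown:
        raise ValueError(f"dependencies on undefined steps: {sorted(unknown)}")

    results: dict[str, Any] = {}
    # static_order() yields each step only after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(max_retries + 1):
            try:
                results[name] = steps[name](results)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # retries exhausted: surface the failure
    return results
```

A cycle in `deps` makes `TopologicalSorter` raise `graphlib.CycleError`, so an invalid workflow fails before any step runs.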

## Honest Comparison

| Capability | AgentOS | LangChain | CrewAI | AutoGen |
|------------|---------|-----------|--------|---------|
| Built-in testing sandbox | ✅ Native | ❌ External setup | ❌ External setup | ❌ External setup |
| Governance (budget/kill switch) | ✅ Native | ⚠️ Custom code | ⚠️ Custom code | ⚠️ Custom code |
| Real-time ops dashboard | ✅ Native | ⚠️ LangSmith add-on | ❌ | ❌ |
| Batteries-included platform | ✅ Yes | ⚠️ Framework-first | ⚠️ Orchestration-first | ⚠️ Research-first |
| Ecosystem maturity | 🌱 Growing | ✅ Very mature | ✅ Mature | ✅ Mature |

## Benchmarks

See [full benchmark results](docs/benchmarks.md). Key findings:

- Our weighted evaluation ensemble correlates 0.91 with human judgment

- Local embeddings achieve 95% of OpenAI quality at zero cost

- Governance adds <5ms overhead to any query
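For readers who want to reproduce a figure like the first one on their own data, the sketch below shows one way a weighted ensemble score and its correlation with human ratings could be computed. It assumes Pearson correlation and a simple weighted mean; the benchmark document may define both differently, and the function names here are hypothetical.

```python
from math import sqrt


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-evaluator scores; weights need not sum to 1."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

In use, `weighted_score` would be applied per test case to produce one ensemble score each, and `pearson` would then compare that list against the matching human ratings.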

## Architecture

See the architecture diagram above and `docs/` for component-level details and ADRs.

## Project Structure

```text
agentos/
├── src/agentos/      # Core platform modules
├── frontend/         # React frontend
├── dashboard/        # Web dashboard UI
├── deploy/helm/      # Helm charts
├── examples/         # Runnable examples
├── tests/            # Unit and integration tests
└── docs/             # Docs and ADRs
```

## Contributing

Contributions are welcome: [CONTRIBUTING.md](CONTRIBUTING.md)

## Roadmap

Roadmap and upcoming work are tracked in [GitHub Issues](https://github.com/sukethrp/agentos/issues).

- [ ] Agent-to-Agent mesh protocol

- [x] MCP server with stdio/SSE transport

- [x] Agent-to-agent delegation with shared context
