https://github.com/polos-dev/polos
AI agents write code, run commands, and delete files - autonomously. Polos gives them isolated sandboxes with built-in tools (shell, file system, web search), approval flows that reach you on various channels, and durable execution with automatic retries, prompt caching, and concurrency control. Agents get full power. Your systems stay safe.
https://github.com/polos-dev/polos
agent-orchestration agentic-ai ai-agents ai-observability developer-tools durable-execution human-in-the-loop python sandbox typescript
Last synced: 4 months ago
JSON representation
AI agents write code, run commands, and delete files - autonomously. Polos gives them isolated sandboxes with built-in tools (shell, file system, web search), approval flows that reach you on various channels, and durable execution with automatic retries, prompt caching, and concurrency control. Agents get full power. Your systems stay safe.
- Host: GitHub
- URL: https://github.com/polos-dev/polos
- Owner: polos-dev
- License: apache-2.0
- Created: 2026-01-23T21:22:02.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-02-16T07:09:22.000Z (4 months ago)
- Last Synced: 2026-02-16T09:50:28.693Z (4 months ago)
- Topics: agent-orchestration, agentic-ai, ai-agents, ai-observability, developer-tools, durable-execution, human-in-the-loop, python, sandbox, typescript
- Language: TypeScript
- Homepage: https://polos.dev
- Size: 2.05 MB
- Stars: 19
- Watchers: 0
- Forks: 1
- Open Issues: 13
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
README
Durable execution platform for AI agents
AI agents that survive crashes, resume mid-execution, and pause for human approval - with zero manual checkpointing, retry logic, or queue management.
⭐ Star us to support the project!
---
Polos is a durable execution platform for AI agents. It provides the stateful infrastructure required to run long-running, autonomous agents reliably at scale, including a **built-in event system** for agent coordination, so you don't need to bolt on Kafka or RabbitMQ.
Write it all in plain Python or TypeScript. No DAGs to define, no graph syntax to learn. Use loops, conditionals, and function calls naturally while Polos handles durability, reliability and scaling automatically.
**Python**
```python
from polos import Agent, workflow, WorkflowContext
order_agent = Agent(
provider="openai",
model="gpt-4o",
tools=[check_inventory, calculate_shipping]
)
@workflow(trigger_on_event="order/new")
async def process_order(ctx: WorkflowContext, order: ProcessOrderInput):
# Agent validates order and checks inventory
validation = await ctx.step.agent_invoke_and_wait(
"validate_order",
order_agent.with_input(f"Validate this order: {order}")
)
if not validation.result.valid:
return ProcessOrderOutput(status="invalid", reason=validation.result.reason)
# High-value orders need approval - suspend until human decides
if order.amount > 1000:
decision = await ctx.step.suspend(
"approval",
data={"id": order.id, "amount": order.amount, "items": order.items}
)
if not decision.data["approved"]:
return ProcessOrderOutput(status="rejected")
# Charge customer (exactly-once guarantee)
payment = await ctx.step.run("charge", charge_stripe, order)
# Wait for warehouse pickup (could be hours or days)
await ctx.step.wait_for_event("wait_pickup", topic=f"warehouse.pickup/{order.id}")
# Send shipping notification
await ctx.step.run("notify", send_shipping_email, order)
return ProcessOrderOutput(status="completed", payment_id=payment.id)
```
**TypeScript**
```typescript
import { defineAgent, defineWorkflow } from "@polos/sdk";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
const orderAgent = defineAgent({
id: "order-agent",
model: openai("gpt-4o"),
tools: [checkInventory, calculateShipping],
});
const processOrder = defineWorkflow(
{
id: "process-order",
triggerOnEvent: "order/new",
payloadSchema: z.object({
id: z.string(),
amount: z.number(),
items: z.array(z.string()),
}),
},
async (ctx, order) => {
// Agent validates order and checks inventory
const validation = await ctx.step.agentInvokeAndWait(
"validate_order",
orderAgent.withInput(`Validate this order: ${JSON.stringify(order)}`)
);
if (!validation.result.valid) {
return { status: "invalid", reason: validation.result.reason };
}
// High-value orders need approval - suspend until human decides
if (order.amount > 1000) {
const decision = await ctx.step.suspend("approval", {
data: { id: order.id, amount: order.amount, items: order.items },
});
if (!decision.data.approved) {
return { status: "rejected" };
}
}
// Charge customer (exactly-once guarantee)
const payment = await ctx.step.run("charge", () => chargeStripe(order));
// Wait for warehouse pickup (could be hours or days)
await ctx.step.waitForEvent("wait_pickup", {
topic: `warehouse.pickup/${order.id}`,
});
// Send shipping notification
await ctx.step.run("notify", () => sendShippingEmail(order));
return { status: "completed", paymentId: payment.id };
}
);
```
This workflow survives crashes, resumes mid-execution, and pauses for approval - automatically.
---
## Architecture
Polos consists of three components:
- **Orchestrator**: Manages execution state, handles retries, and coordinates workers
- **Worker**: Runs your agents and workflows, connects to the orchestrator
- **SDK**: Python and TypeScript libraries for defining agents, workflows, and tools
---
## See It In Action
Imagine a workflow that charges a customer, then pauses for a human fraud review. In most frameworks, a server restart during that 24-hour wait would lose the state - or worse, re-run the charge on reboot. Polos guarantees exactly-once durable execution.
**Timeline of what's happening:**
1. `charge_stripe` runs → Polos checkpoints the execution result
2. Workflow suspends for fraud review → Worker resources freed
3. Worker 1 crashes during the wait
4. Fraud team approves → Signal sent to orchestrator
5. Worker 2 resumes on a different machine → Stripe is **not** called again, result replayed from the log guaranteeing exactly-once execution
6. Confirmation email sent → workflow completes
Polos handles failures, rescheduling, and checkpointing. You just focus on business logic.
---
## Why Polos?
| Feature | Description |
|---------|-------------|
| **🧠 Durable State** | Your agent survives crashes with call stack and local variables intact. Step 18 of 20 fails? Resume from step 18. No wasted LLM calls. |
| **🚦 Global Concurrency** | System-wide rate limiting with queues and concurrency keys. Prevent one rogue agent from exhausting your entire OpenAI quota. |
| **🤝 Human-in-the-Loop** | Native support for pausing execution. Wait hours or days for user approval and resume with full context. Paused agents consume zero compute. |
| **📡 Agent Handoffs** | Transactional memory for multi-agent systems. Pass reasoning history between specialized agents without context drift. |
| **🔍 Decision-Level Observability** | Trace the reasoning behind every tool call, not just raw logs. See why your agent chose Tool B over Tool A. |
| **⚡ Production Ready** | Automatic retries, exactly-once execution guarantees, OpenTelemetry tracing built-in. |
### Logic Belongs in Code, Not Configs
**With Polos:**
**Python**
```python
@workflow
async def process_order(ctx: WorkflowContext, order: ProcessOrderInput):
if order.amount > 1000:
approved = await ctx.step.suspend("approval", data=order.model_dump())
if not approved.data["ok"]:
return {"status": "rejected"}
await ctx.step.run("charge", charge_stripe, order)
await ctx.step.run("notify", send_email, order)
```
**TypeScript**
```typescript
const processOrder = defineWorkflow({ id: "process-order" }, async (ctx, order) => {
if (order.amount > 1000) {
const approved = await ctx.step.suspend("approval", { data: order });
if (!approved.data.ok) {
return { status: "rejected" };
}
}
await ctx.step.run("charge", () => chargeStripe(order));
await ctx.step.run("notify", () => sendEmail(order));
});
```
**Other platforms:**
```python
dag = DAG(
nodes=[
Node("check_amount", CheckAmount),
Node("approval", HumanApproval),
Node("charge", ChargeStripe),
Node("notify", SendEmail),
],
edges=[
("check_amount", "approval", condition="amount > 1000"),
("check_amount", "charge", condition="amount <= 1000"),
("approval", "charge", condition="approved"),
("charge", "notify"),
]
)
```
No DAGs. No graph syntax. Just Python or TypeScript.
---
## Quick Start
### 1. Install Polos Server
```bash
curl -fsSL https://install.polos.dev/install.sh | bash
polos-server start
```
Copy the project ID displayed when you start the server. You'll need it in the next steps.
### 2. Install the SDK
**Python**
```bash
pip install polos-sdk
```
**TypeScript**
```bash
npm install @polos/sdk
```
### 3. Create your first agent
**Python**
```python
# worker.py
from polos import Agent, Worker, PolosClient
weather_agent = Agent(
id="weather_agent",
provider="openai",
model="gpt-4o-mini",
system_prompt="You are a helpful weather assistant.",
tools=[get_weather],
)
client = PolosClient(project_id="your-project-id")
worker = Worker(client=client, agents=[weather_agent])
if __name__ == "__main__":
import asyncio
asyncio.run(worker.run())
```
**TypeScript**
```typescript
// worker.ts
import { defineAgent, PolosClient, Worker } from "@polos/sdk";
import { openai } from "@ai-sdk/openai";
const weatherAgent = defineAgent({
id: "weather-agent",
model: openai("gpt-4o-mini"),
systemPrompt: "You are a helpful weather assistant.",
tools: [getWeather],
});
const client = new PolosClient({ projectId: "your-project-id" });
const worker = new Worker({ client, agents: [weatherAgent] });
await worker.run();
```
### 4. Run your agent
**Python**
```bash
# Terminal 1: Start the worker
python worker.py
# Terminal 2: Invoke the agent
python main.py
```
**TypeScript**
```bash
# Terminal 1: Start the worker
npx tsx worker.ts
# Terminal 2: Invoke the agent
npx tsx main.ts
```
### 5. See it in action
Open the Polos UI to see your agent's execution trace, tool calls, and reasoning:
📖 **[Full Quick Start Guide →](https://docs.polos.dev/quickstart)**
---
## Examples
### Agents
| Example | Python | TypeScript | Description |
|---------|--------|------------|-------------|
| Agent with tools | [Python](./python-examples/01-agent-with-tools) | [TypeScript](./typescript-examples/01-agent-with-tools) | Simple agent with tool calling |
| Structured Output | [Python](./python-examples/02-structured-output) | [TypeScript](./typescript-examples/02-structured-output) | Agent with structured model responses |
| Streaming | [Python](./python-examples/03-agent-streaming) | [TypeScript](./typescript-examples/03-agent-streaming) | Real-time streaming responses |
| Conversational Chat | [Python](./python-examples/04-conversational-chat) | [TypeScript](./typescript-examples/04-conversational-chat) | Multi-turn conversations with memory |
| Thinking Agent | [Python](./python-examples/05-thinking-agent) | [TypeScript](./typescript-examples/05-thinking-agent) | Chain-of-thought reasoning |
| Guardrails | [Python](./python-examples/06-guardrails) | [TypeScript](./typescript-examples/06-guardrails) | Input/output validation |
| Multi-Agent Coordination | [Python](./python-examples/14-router-coordinator) | [TypeScript](./typescript-examples/14-router-coordinator) | Workflow orchestrating multiple agents |
| Order Processing | [Python](./python-examples/17-order-processing) | [TypeScript](./typescript-examples/17-order-processing) | Human-in-the-loop fraud review |
| Sandbox Tools | [Python](./python-examples/18-sandbox-tools) | [TypeScript](./typescript-examples/18-sandbox-tools) | Code execution in an isolated Docker container |
| Exec Security | [Python](./python-examples/19-exec-security) | [TypeScript](./typescript-examples/19-exec-security) | Allowlist-based command approval |
| Web Search Agent | [Python](./python-examples/20-web-search-agent) | [TypeScript](./typescript-examples/20-web-search-agent) | Research agent with Tavily web search |
| Local Sandbox | [Python](./python-examples/21-local-sandbox) | [TypeScript](./typescript-examples/21-local-sandbox) | Sandbox tools running on the host machine |
### Workflows
| Example | Python | TypeScript | Description |
|---------|--------|------------|-------------|
| Workflow Basics | [Python](./python-examples/08-workflow-basics) | [TypeScript](./typescript-examples/08-workflow-basics) | Core workflow patterns |
| Suspend/Resume | [Python](./python-examples/09-suspend-resume) | [TypeScript](./typescript-examples/09-suspend-resume) | Human-in-the-loop approvals |
| State Persistence | [Python](./python-examples/10-state-persistence) | [TypeScript](./typescript-examples/10-state-persistence) | Durable state across executions |
| Error Handling | [Python](./python-examples/11-error-handling) | [TypeScript](./typescript-examples/11-error-handling) | Retry, fallback, compensation patterns |
| Queues & Concurrency | [Python](./python-examples/12-shared-queues) | [TypeScript](./typescript-examples/12-shared-queues) | Rate limiting and concurrency control |
| Parallel Execution | [Python](./python-examples/13-parallel-review) | [TypeScript](./typescript-examples/13-parallel-review) | Fan-out/fan-in patterns |
### Events & Scheduling
| Example | Python | TypeScript | Description |
|---------|--------|------------|-------------|
| Event-Triggered | [Python](./python-examples/15-event-triggered) | [TypeScript](./typescript-examples/15-event-triggered) | Pub/sub event-driven workflows |
| Scheduled Workflows | [Python](./python-examples/16-scheduled-workflow) | [TypeScript](./typescript-examples/16-scheduled-workflow) | Cron-based scheduling |
### Human-in-the-Loop
| Example | Python | TypeScript | Description |
|---------|--------|------------|-------------|
| Approval Page | [Python](./python-examples/22-approval-page) | [TypeScript](./typescript-examples/22-approval-page) | Web UI for workflow approval with suspend/resume |
---
## Under the Hood
Polos captures the result of every side effect - tool calls, API responses, time delays as a durable log.
If your process dies, Polos replays the workflow from the log, returning previously-recorded results instead of re-executing them.
Your agent’s exact local variables and call stack are restored in milliseconds.
**Completed steps are never re-executed - so you never pay for an LLM call twice.**
---
## Documentation
For detailed documentation, visit **[docs.polos.dev](https://docs.polos.dev)**
- 📖 [Quick Start Guide](https://docs.polos.dev/quickstart)
- 🤖 [Building Agents](https://docs.polos.dev/agents/overview)
- ⚙️ [Workflow Patterns](https://docs.polos.dev/workflows/overview)
- 📡 [Events](https://docs.polos.dev/workflows/event-triggered-workflows)
- ⏰ [Scheduling](https://docs.polos.dev/workflows/scheduled-workflows)
- 🔍 [Observability](https://docs.polos.dev/observability/tracing)
---
## Community
Join our community to get help, share ideas, and stay updated:
- ⭐ [Star us on GitHub](https://github.com/polos-dev/polos)
- 💬 [Join our Discord](https://discord.gg/ZAxHKMPwFG)
- 📖 [Read the Docs](https://docs.polos.dev)
---
## Contributing
We welcome contributions! Whether it's bug reports, feature requests, documentation improvements, or code contributions.
- 🐛 [Report Issues](https://github.com/polos-dev/polos/issues)
- 💡 [Feature Requests](https://github.com/polos-dev/polos/issues)
- 📖 [Contributing Guide](CONTRIBUTING.md)
---
## License
Polos is [Apache 2.0 licensed](LICENSE).