An open API service indexing awesome lists of open source software.

https://github.com/haasonsaas/agent-harness

Unified harness for hot-swapping between OpenAI Agents SDK and Anthropic Claude Agent SDK with shared tool registry
https://github.com/haasonsaas/agent-harness

agent-sdk ai-agents anthropic claude hot-swapping llm mcp openai python

Last synced: 9 months ago
JSON representation

Unified harness for hot-swapping between OpenAI Agents SDK and Anthropic Claude Agent SDK with shared tool registry

Awesome Lists containing this project

README

          

# Agent Harness - Unified OpenAI & Anthropic SDK Interface

[![CI](https://github.com/haasonsaas/agent-harness/workflows/CI/badge.svg)](https://github.com/haasonsaas/agent-harness/actions)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A production-ready harness for **hot-swapping** between OpenAI Agents SDK and Anthropic Claude Agent SDK with a unified tool registry and common API.

## Key Features

✅ **Unified Tool Registry** - Register tools once, use with any provider
✅ **Hot-Swapping** - Switch providers at runtime without code changes
✅ **Lazy Loading** - SDKs imported only when needed
✅ **Streaming Support** - Real-time response streaming with consistent deltas
✅ **Provider Comparison** - Run same prompt on multiple providers in parallel
✅ **Type-Safe** - Full type hints and automatic JSON Schema generation
✅ **Production-Ready** - Error handling, retries, timeouts, structured logging
✅ **Resource Management** - Async context managers and proper cleanup
✅ **Thread-Safe** - Concurrent tool registration and execution
✅ **Extensible** - Easy to add new providers

## Architecture

```
┌─────────────────────────────────────────┐
│ AgentHarness │
│ • switch_provider() │
│ • run() / stream() │
│ • compare_providers() │
└─────────────┬───────────────────────────┘

┌──────┴──────┐
│ │
┌──────▼──────┐ ┌───▼──────────┐
│BaseHarness │ │ToolRegistry │
│(Abstract) │ │@register_tool│
└──────┬──────┘ └──────────────┘

┌────┴────┐
│ │
┌─▼─────┐ ┌─▼────────┐
│OpenAI │ │Claude │
│Harness│ │Harness │
└───┬───┘ └───┬──────┘
│ │
┌───▼───┐ ┌───▼──────┐
│OpenAI │ │Claude │
│Agents │ │Agent SDK │
│SDK │ │+ MCP │
└───────┘ └──────────┘
```

## Quick Start

### 1. Installation

```bash
# Install with optional dependencies
pip install -e ".[all]" # Both providers
pip install -e ".[openai]" # OpenAI only
pip install -e ".[anthropic]" # Claude only
pip install -e ".[dev]" # Development tools

# Or install SDKs separately
pip install openai-agents claude-agent-sdk
```

### 2. Set API Keys

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
```

### 3. Basic Usage

```python
import asyncio
from agent_harness import AgentHarness, HarnessConfig, register_tool

# Register a tool once - works with both providers
@register_tool(description="Get weather for a city")
def get_weather(city: str) -> str:
return f"Weather in {city}: sunny, 72°F"

async def main():
config = HarnessConfig(
system_prompt="You are a helpful assistant",
tool_names=["get_weather"],
max_turns=5,
timeout_sec=30.0
)

# Use async context manager for proper cleanup
async with AgentHarness(provider="openai", config=config) as harness:
result = await harness.run("What's the weather in Tokyo?")
print(f"OpenAI: {result.final_output}")
print(f"Latency: {result.latency_ms}ms")

# Hot-swap to Claude
await harness.switch_provider("claude")
result = await harness.run("What's the weather in Tokyo?")
print(f"Claude: {result.final_output}")
print(f"Latency: {result.latency_ms}ms")

asyncio.run(main())
```

## What's New in v0.2.0

🎉 **Major improvements for production use:**

- **Automatic JSON Schema generation** from Python type hints
- **Custom error taxonomy** with retryable errors
- **Structured logging** with request IDs and timing
- **Request timeouts and retry logic** with exponential backoff
- **Async context managers** for proper resource cleanup
- **Thread-safe registry** for concurrent operations
- **Enhanced configuration** with validation
- **Comprehensive test suite** with pytest
- **CI/CD pipeline** with GitHub Actions

See [CHANGELOG.md](CHANGELOG.md) for full details.

## Core Concepts

### Tool Registry

The `@register_tool` decorator adds tools to a global registry accessible by all providers:

```python
from agent_harness import register_tool

@register_tool(description="Add two numbers")
def add(a: float, b: float) -> float:
"""Adds two numbers together"""
return a + b

@register_tool(description="Calculate factorial")
def factorial(n: int) -> int:
"""Calculate n!"""
if n <= 1:
return 1
return n * factorial(n - 1)
```

Tools are automatically:
- Extracted with parameter types via introspection
- Wrapped for OpenAI using `function_tool`
- Wrapped for Claude using `@tool` + in-process MCP server
- Available to both providers without duplication

### Configuration

`HarnessConfig` provides unified configuration:

```python
from agent_harness import HarnessConfig

config = HarnessConfig(
system_prompt="You are an expert data analyst",
model="gpt-4o", # Provider-specific model
max_turns=10, # Max conversation turns
temperature=0.7, # LLM temperature (0.0-2.0)
timeout_sec=30.0, # Request timeout
max_output_tokens=1000, # Max output tokens
top_p=0.9, # Top-p sampling
stop_sequences=["STOP"], # Stop sequences
tool_names=["add", "multiply"], # Specific tools to use
retry_attempts=3, # Number of retries
retry_backoff=1.0, # Backoff multiplier
request_id="custom-id", # Custom request ID (auto-generated if None)
provider_options={ # Provider-specific options
"permission_mode": "acceptEdits" # Claude-specific
}
)
```

### Base Harness Interface

All providers implement `BaseHarness`:

```python
class BaseHarness(ABC):
@abstractmethod
async def run(self, prompt: str) -> AgentResponse:
"""Run agent and return final response"""
pass

@abstractmethod
async def stream(self, prompt: str) -> AsyncIterator[str]:
"""Stream responses in real-time"""
pass
```

## Advanced Usage

### Streaming Responses

```python
harness = AgentHarness(provider="openai", config=config)

async for chunk in harness.stream("Write a story about AI"):
print(chunk, end="", flush=True)
```

### Provider Comparison

Compare both providers side-by-side:

```python
results = await harness.compare_providers(
"Explain quantum computing in simple terms"
)

for provider, response in results.items():
print(f"\n{provider.upper()}:")
print(response.final_output)
```

### Dynamic Tool Registration

Register tools at runtime:

```python
from agent_harness import register_tool, get_registry

@register_tool(description="Convert miles to kilometers")
def miles_to_km(miles: float) -> float:
return miles * 1.60934

print(f"Registered: {list(get_registry().get_all().keys())}")
```

### Provider-Specific Features

#### OpenAI: Guardrails & Handoffs

```python
config = HarnessConfig(
system_prompt="You are an assistant",
provider_options={
"guardrails": [my_guardrail],
"handoffs": [other_agent]
}
)
```

#### Claude: MCP Servers & Hooks

```python
config = HarnessConfig(
system_prompt="You are a coding assistant",
provider_options={
"allowed_tools": ["Read", "Write", "Bash"],
"permission_mode": "acceptEdits",
"cwd": "/path/to/project",
"hooks": {
"PreToolUse": [check_bash_hook]
}
}
)
```

## API Reference

### AgentHarness

Main class for hot-swapping providers.

```python
class AgentHarness:
def __init__(
self,
provider: str = "openai",
config: Optional[HarnessConfig] = None,
api_key: Optional[str] = None
)

def switch_provider(self, provider: str) -> None
async def run(self, prompt: str) -> AgentResponse
async def stream(self, prompt: str) -> AsyncIterator[str]
async def compare_providers(
self,
prompt: str,
providers: Optional[list[str]] = None
) -> dict[str, AgentResponse]
```

**Providers:** `"openai"`, `"claude"`, `"anthropic"`

### HarnessConfig

```python
@dataclass
class HarnessConfig:
system_prompt: str = "You are a helpful assistant."
model: Optional[str] = None
max_turns: int = 10
temperature: float = 1.0
tool_names: Optional[list[str]] = None
provider_options: dict[str, Any] = field(default_factory=dict)
```

### AgentResponse

```python
@dataclass
class AgentResponse:
final_output: str
messages: list[Any] = field(default_factory=list)
metadata: dict[str, Any] = field(default_factory=dict)
```

### Tool Registry

```python
def register_tool(
name: Optional[str] = None,
description: Optional[str] = None
) -> Callable

def get_registry() -> ToolRegistry
```

## Implementation Details

### OpenAI Harness

- Lazily imports `agents` module
- Wraps tools using `function_tool` decorator
- Creates `Agent` with `model_config`
- Uses `Runner.run()` for execution
- Streams via `Runner.run_streamed()`

### Claude Harness

- Lazily imports `claude_agent_sdk`
- Wraps tools with `@tool` decorator
- Creates in-process MCP server via `create_sdk_mcp_server()`
- Uses `ClaudeAgentOptions` for configuration
- Executes via `query()` or `ClaudeSDKClient`
- Extracts `TextBlock` content from responses

## Examples

See [example_usage.py](example_usage.py) for comprehensive examples:

- ✅ Basic hot-swapping
- ✅ Tool usage with both providers
- ✅ Streaming responses
- ✅ Provider comparison
- ✅ Dynamic tool registration
- ✅ Provider-specific options

Run examples:

```bash
python example_usage.py
```

## Design Principles

1. **Single Responsibility**: Each provider adapter handles only its SDK
2. **Lazy Loading**: SDKs imported only when provider is selected
3. **Don't Repeat Yourself**: Tools registered once, used everywhere
4. **Open/Closed**: Easy to add new providers by extending `BaseHarness`
5. **Dependency Inversion**: Code depends on `BaseHarness` abstraction

## Comparison: OpenAI vs Claude

| Feature | OpenAI Agents SDK | Claude Agent SDK |
|---------|------------------|------------------|
| **Release** | March 2025 | 2025 |
| **Core API** | Responses API | Model Context Protocol (MCP) |
| **Tool Format** | `function_tool` decorator | `@tool` + MCP server |
| **Streaming** | `run_streamed()` | `query()` iterator |
| **Built-in Tools** | Web search, file search, code interpreter | Read, Write, Bash, etc. |
| **Handoffs** | Native support | Programmatic subagents |
| **Guardrails** | Built-in | Hooks (PreToolUse, PostToolUse) |
| **Sessions** | SQLite, Redis | Session forking |

## Extending the Harness

To add a new provider:

```python
class MyProviderHarness(BaseHarness):
async def run(self, prompt: str) -> AgentResponse:
# Import SDK lazily
from my_sdk import Agent

# Get registered tools
tools = self._get_registered_tools()

# Wrap and execute
# ...

return AgentResponse(...)

async def stream(self, prompt: str) -> AsyncIterator[str]:
# Implement streaming
# ...

# Register in AgentHarness.PROVIDERS
AgentHarness.PROVIDERS["myprovider"] = MyProviderHarness
```

## Testing

```bash
# Run examples
python example_usage.py

# Test with both providers
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
python example_usage.py
```

## Repository Structure

```
agent_sdk/
├── agent_harness.py # Core harness implementation
├── example_usage.py # Comprehensive examples
├── README.md # This file
├── openai-agents-python/ # OpenAI SDK (cloned)
└── claude-agent-sdk-python/ # Claude SDK (cloned)
```

## License

MIT