https://github.com/dravr-ai/dravr-embacle

Rust library wrapping 12 AI CLI tools as pluggable LLM providers — with OpenAI API client, ACP headless mode, agent loop, MCP server, and OpenAI-compatible REST API
https://github.com/dravr-ai/dravr-embacle
acp agent claude-code cline codex copilot cursor gemini goose kilo kiro llm mcp openai-compatible opencode rust-lang warp
Last synced: 2 months ago
JSON representation
Rust library wrapping 12 AI CLI tools as pluggable LLM providers — with OpenAI API client, ACP headless mode, agent loop, MCP server, and OpenAI-compatible REST API
Host: GitHub
URL: https://github.com/dravr-ai/dravr-embacle
Owner: dravr-ai
License: other
Created: 2026-02-26T05:48:17.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-04-18T19:12:25.000Z (2 months ago)
Last Synced: 2026-04-18T21:14:03.754Z (2 months ago)
Topics: acp, agent, claude-code, cline, codex, copilot, cursor, gemini, goose, kilo, kiro, llm, mcp, openai-compatible, opencode, rust-lang, warp
Language: Rust
Homepage: https://docs.rs/embacle
Size: 672 KB
Stars: 3
Watchers: 0
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE-APACHE
- Security: SECURITY.md
- Agents: AGENTS.md
Awesome Lists containing this project

README

          # Embacle — LLM Runners

[![crates.io](https://img.shields.io/crates/v/embacle.svg)](https://crates.io/crates/embacle)

[![docs.rs](https://docs.rs/embacle/badge.svg)](https://docs.rs/embacle)

[![CI](https://github.com/dravr-ai/dravr-embacle/actions/workflows/ci.yml/badge.svg)](https://github.com/dravr-ai/dravr-embacle/actions/workflows/ci.yml)

[![License](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE.md)

Standalone Rust library that wraps 12 AI CLI tools and SDKs as pluggable LLM providers, with vision/image support.

Instead of integrating with LLM APIs directly (which require API keys, SDKs, and managing auth), **Embacle** delegates to CLI tools that users already have installed and authenticated — getting model upgrades, auth management, and protocol handling for free. For GitHub Copilot, an optional headless mode communicates via the ACP (Agent Client Protocol) for SDK-managed tool calling.

## Run It

Embacle ships as two ready-to-run servers — no code required:

**OpenAI-compatible HTTP server** — drop-in replacement for any OpenAI client:

```bash

embacle-server --provider copilot --port 3000

```

```bash

curl http://localhost:3000/v1/chat/completions \

  -H "Content-Type: application/json" \

  -d '{"model": "copilot", "messages": [{"role": "user", "content": "hello"}]}'

```

**MCP server** — connect Claude Desktop, editors, or any MCP client directly:

```bash

# stdio (editor integration)

embacle-mcp --provider copilot

# HTTP (network-accessible)

embacle-mcp --transport http --port 3000 --provider copilot

```

Both modes support all 12 CLI runners, streaming, model routing, and vision. See [REST API Server](#rest-api-server-embacle-server) and [MCP Server](#mcp-server-embacle-mcp) for full details.

## Table of Contents

- [Install](#install)

- [Supported Runners](#supported-runners)

- [Quick Start](#quick-start)

- [REST API Server](#rest-api-server-embacle-server)

- [MCP Server](#mcp-server-embacle-mcp)

- [OpenAI API](#openai-api-feature-flag)

- [Copilot Headless](#copilot-headless-feature-flag)

- [Vision / Image Support](#vision--image-support)

- [Docker](#docker)

- [C FFI Static Library](#c-ffi-static-library)

- [Architecture](#architecture)

- [Tested With](#tested-with)

- [License](#license)

## Install

### Homebrew (macOS / Linux) — recommended

```bash

brew tap dravr-ai/tap

brew install embacle

```

This installs two binaries:

- **`embacle-server`** — OpenAI-compatible REST API + MCP server

- **`embacle-mcp`** — standalone MCP server for editor integration

### Docker

```bash

docker pull ghcr.io/dravr-ai/embacle:latest

docker run -p 3000:3000 ghcr.io/dravr-ai/embacle --provider copilot

```

### Cargo (library)

```toml

[dependencies]

embacle = "0.15"

```

## Supported Runners

### CLI Runners (subprocess-based)

| Runner | Binary | Features |

|--------|--------|----------|

| Claude Code | `claude` | JSON output, streaming, system prompts, session resume |

| GitHub Copilot | `copilot` | Text parsing, streaming |

| Cursor Agent | `cursor-agent` | JSON output, streaming, MCP approval |

| OpenCode | `opencode` | JSON events, session management |

| Gemini CLI | `gemini` | JSON/stream-JSON output, streaming, session resume |

| Codex CLI | `codex` | JSONL output, streaming, sandboxed exec mode |

| Goose CLI | `goose` | JSON/stream-JSON output, streaming, no-session mode |

| Cline CLI | `cline` | NDJSON output, streaming, session resume via task IDs |

| Continue CLI | `cn` | JSON output, single-shot completions |

| Warp | `oz` | NDJSON output, conversation resume |

| Kiro CLI | `kiro-cli` | ANSI-stripped text output, auto model selection |

| Kilo Code | `kilo` | NDJSON output, streaming, token tracking, 500+ models via Kilo Gateway |

### HTTP API Runners (feature-flagged)

| Runner | Feature Flag | Features |

|--------|-------------|----------|

| OpenAI API | `openai-api` | Any OpenAI-compatible endpoint (OpenAI, Groq, Gemini, Ollama, vLLM), streaming, tool calling, model discovery |

### ACP Runners (persistent connection)

| Runner | Feature Flag | Features |

|--------|-------------|----------|

| GitHub Copilot Headless | `copilot-headless` | NDJSON/JSON-RPC via `copilot --acp`, SDK-managed tool calling, streaming |

## Quick Start

Use a CLI runner:

```rust

use std::path::PathBuf;

use embacle::{ClaudeCodeRunner, RunnerConfig};

use embacle::types::{ChatMessage, ChatRequest, LlmProvider};

#[tokio::main]

async fn main() -> Result<(), embacle::types::RunnerError> {

    let config = RunnerConfig::new(PathBuf::from("claude"));

    let runner = ClaudeCodeRunner::new(config);

    let request = ChatRequest::new(vec![

        ChatMessage::user("What is the capital of France?"),

    ]);

    let response = runner.complete(&request).await?;

    println!("{}", response.content);

    Ok(())

}

```

## REST API Server (`embacle-server`)

A unified OpenAI-compatible HTTP server with built-in MCP support that proxies requests to embacle runners. Any client that speaks the OpenAI chat completions API or MCP protocol can use it without modification. Supports `--transport stdio` for MCP-only mode (editor integration).

### Usage

```bash

# Start with default provider (copilot) on localhost:3000

embacle-server

# Specify provider and port

embacle-server --provider claude_code --port 8080 --host 0.0.0.0

# MCP-only mode via stdio (for editor/client integration)

embacle-server --transport stdio --provider copilot

```

### Endpoints

| Method | Path | Description |

|--------|------|-------------|

| `POST` | `/v1/chat/completions` | Chat completion (streaming and non-streaming) |

| `GET` | `/v1/models` | List available providers and models |

| `GET` | `/health` | Per-provider readiness check |

| `POST` | `/mcp` | MCP Streamable HTTP (JSON-RPC 2.0) |

### MCP Streamable HTTP

The server also speaks [MCP](https://modelcontextprotocol.io/) at `POST /mcp`, accepting JSON-RPC 2.0 requests. Any MCP-compatible client can connect over HTTP instead of stdio.

| Tool | Description |

|------|-------------|

| `prompt` | Send chat messages to an LLM provider, with optional `model` routing (e.g. `copilot:gpt-4o`) |

| `list_models` | List available providers and the server's default |

```bash

# MCP initialize handshake

curl http://localhost:3000/mcp \

  -H "Content-Type: application/json" \

  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"curl"}}}'

# Call the prompt tool

curl http://localhost:3000/mcp \

  -H "Content-Type: application/json" \

  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"prompt","arguments":{"messages":[{"role":"user","content":"hello"}]}}}'

```

Add `Accept: text/event-stream` to receive SSE-wrapped responses instead of plain JSON.

### Model Routing

The `model` field determines which provider handles the request. Use a `provider:model` prefix to target a specific runner, or pass a bare model name to use the server's default provider.

```bash

# Explicit provider

curl http://localhost:3000/v1/chat/completions \

  -H "Content-Type: application/json" \

  -d '{"model": "claude:opus", "messages": [{"role": "user", "content": "hello"}]}'

# Default provider

curl http://localhost:3000/v1/chat/completions \

  -H "Content-Type: application/json" \

  -d '{"model": "gpt-5.4", "messages": [{"role": "user", "content": "hello"}]}'

```

### Multiplex

Pass an array of models to fan out the same prompt to multiple providers concurrently. Each provider runs in its own task; failures in one don't affect others.

```bash

curl http://localhost:3000/v1/chat/completions \

  -H "Content-Type: application/json" \

  -d '{"model": ["copilot:gpt-4o", "claude:opus"], "messages": [{"role": "user", "content": "hello"}]}'

```

The response uses `object: "chat.completion.multiplex"` with per-provider results and timing.

Streaming is not supported for multiplex requests.

### SSE Streaming

Set `"stream": true` for Server-Sent Events output in OpenAI streaming format (`data: {json}\n\n` with `data: [DONE]` terminator).

### Authentication

Optional. Set `EMBACLE_API_KEY` to require bearer token auth on all endpoints. When unset, all requests are allowed through (localhost development mode). The env var is read per-request, so key rotation doesn't require a restart.

```bash

EMBACLE_API_KEY=my-secret embacle-server

curl http://localhost:3000/v1/models -H "Authorization: Bearer my-secret"

```

## MCP Server (`embacle-mcp`)

A library and standalone binary that exposes embacle runners via the [Model Context Protocol](https://modelcontextprotocol.io/). Connect any MCP-compatible client (Claude Desktop, editors, custom agents) to use all embacle providers.

### Usage

```bash

# Stdio transport (default — for editor/client integration)

embacle-mcp --provider copilot

# HTTP transport (for network-accessible deployments)

embacle-mcp --transport http --host 0.0.0.0 --port 3000 --provider claude_code

```

### MCP Tools

| Tool | Description |

|------|-------------|

| `get_provider` | Get active LLM provider and list available providers |

| `set_provider` | Switch the active provider (`claude_code`, `copilot`, `copilot_headless`, `cursor_agent`, `opencode`, `gemini_cli`, `codex_cli`, `goose_cli`, `cline_cli`, `continue_cli`, `warp_cli`, `kiro_cli`, `kilo_cli`) |

| `get_model` | Get current model and list available models for the active provider |

| `set_model` | Set the model for subsequent requests (pass null to reset to default) |

| `get_multiplex_provider` | Get providers configured for multiplex dispatch |

| `set_multiplex_provider` | Configure providers for fan-out mode |

| `prompt` | Send chat messages to the active provider, or multiplex to all configured providers |

### Client Configuration

Add to your MCP client config (e.g. Claude Desktop `claude_desktop_config.json`):

```json

{

  "mcpServers": {

    "embacle": {

      "command": "embacle-mcp",

      "args": ["--provider", "copilot"]

    }

  }

}

```

## OpenAI API (feature flag)

Enable the `openai-api` feature for HTTP-based communication with any OpenAI-compatible endpoint:

```toml

[dependencies]

embacle = { version = "0.15", features = ["openai-api"] }

```

```rust

use embacle::{OpenAiApiConfig, OpenAiApiRunner};

use embacle::types::{ChatMessage, ChatRequest, LlmProvider};

#[tokio::main]

async fn main() -> Result<(), embacle::types::RunnerError> {

    // Reads OPENAI_API_BASE_URL, OPENAI_API_KEY, OPENAI_API_MODEL from env

    let config = OpenAiApiConfig::from_env();

    let runner = OpenAiApiRunner::new(config).await;

    let request = ChatRequest::new(vec![

        ChatMessage::user("What is the capital of France?"),

    ]);

    let response = runner.complete(&request).await?;

    println!("{}", response.content);

    Ok(())

}

```

Works with any OpenAI-compatible endpoint — OpenAI, Groq, Google Gemini, Ollama, vLLM, and more. To inject a shared HTTP client (e.g. from a connection pool), use `OpenAiApiRunner::with_client(config, client)`.

| Variable | Default | Description |

|----------|---------|-------------|

| `OPENAI_API_BASE_URL` | `https://api.openai.com/v1` | API base URL |

| `OPENAI_API_KEY` | *(none)* | Bearer token for authentication |

| `OPENAI_API_MODEL` | `gpt-5.4` | Default model for completions |

| `OPENAI_API_TIMEOUT_SECS` | `300` | HTTP request timeout |

## AG-UI Progress Events (feature flag)

Enable the `agui` feature to expose the

[AG-UI protocol](https://github.com/ag-ui-protocol/ag-ui) event vocabulary —

the canonical types agents use to broadcast run / step / tool-call / text

progress to user-facing clients.

```toml

[dependencies]

embacle = { version = "0.15", features = ["agui"] }

```

The module is deliberately transport-agnostic: it ships the event enum

(`AgUiEvent`), the filter (`AgUiEventFilter`), and the emitter trait

(`AgUiEmitter`) plus a `NoopEmitter` default. HTTP routing, SSE framing,

and pipeline wiring live in downstream crates that know their runtime

(e.g. [`dravr-platform`](https://github.com/dravr-ai/dravr-platform)'s

`/api/agui/runs/{run_id}/stream` and [`dravr-canot`](https://github.com/dravr-ai/dravr-canot)'s

messaging adapters).

```rust

use embacle::agui::{AgUiEvent, AgUiEventFilter, AgUiEventKind, AgUiEmitter, NoopEmitter};

// Opt out of high-volume per-token text deltas on a Telegram channel.

let filter = AgUiEventFilter::allow_all().without(AgUiEventKind::TextMessageContent);

let sink = NoopEmitter::new(filter);

let event = AgUiEvent::run_started("run_abc", Some("thread_xyz"));

let _ = sink.emit(&event);

```

## Copilot Headless (feature flag)

Enable the `copilot-headless` feature for ACP-based communication with SDK-managed tool calling:

```toml

[dependencies]

embacle = { version = "0.15", features = ["copilot-headless"] }

```

```rust

use embacle::{CopilotHeadlessRunner, CopilotHeadlessConfig};

use embacle::types::{ChatMessage, ChatRequest, LlmProvider};

#[tokio::main]

async fn main() -> Result<(), embacle::types::RunnerError> {

    // Reads COPILOT_HEADLESS_MODEL, COPILOT_GITHUB_TOKEN, etc. from env

    let runner = CopilotHeadlessRunner::from_env();

    let request = ChatRequest::new(vec![

        ChatMessage::user("Explain Rust ownership"),

    ]);

    let response = runner.complete(&request).await?;

    println!("{}", response.content);

    Ok(())

}

```

The headless runner spawns `copilot --acp` per request and communicates via NDJSON-framed JSON-RPC. The system prompt is passed via ACP's `session/new` `systemPrompt` parameter. Conversation history from prior turns is serialized into a `` block in the prompt text for multi-turn continuity. The `max_tokens` field from `ChatRequest` is forwarded to ACP's `session/prompt` as `maxTokens`.

Configuration via environment variables:

| Variable | Default | Description |

|----------|---------|-------------|

| `COPILOT_CLI_PATH` | auto-detect | Override path to copilot binary |

| `COPILOT_HEADLESS_MODEL` | top entry of ranked catalog (see `copilot_models::CATALOG`) | Default model for completions |

| `COPILOT_GITHUB_TOKEN` | stored OAuth | GitHub auth token (falls back to `GH_TOKEN`, `GITHUB_TOKEN`) |

| `COPILOT_HEADLESS_MAX_HISTORY_TURNS` | `20` | Max conversation history turns in prompt (0 disables) |

| `COPILOT_HEADLESS_INJECT_SYSTEM_IN_PROMPT` | `true` | Prepend system prompt as plain text in prompt (set `false` to rely on ACP `systemPrompt` only) |

## Vision / Image Support

Embacle supports sending images alongside text prompts via the `ImagePart` type. Images are base64-encoded and tagged with a MIME type (PNG, JPEG, WebP, GIF).

### Which providers support vision?

| Provider | Vision | How |

|----------|--------|-----|

| Copilot Headless (ACP) | Native | Images sent as ACP `image` content blocks |

| OpenAI API | Native | Images sent as `image_url` parts with `data:` URIs |

| C FFI | Native | Images forwarded to copilot headless via `image_url` content |

| All 12 CLI runners | Tempfile | Images decoded to temp files, file paths injected into prompt |

CLI runners materialize base64 images to a temp directory and append `[Attached images]` with file paths to the user message. The temp directory is kept alive until the subprocess finishes.

### Library usage

```rust

use embacle::types::{ChatMessage, ChatRequest, ImagePart};

let image = ImagePart::new(base64_data, "image/png")?;

let request = ChatRequest::new(vec![

    ChatMessage::user_with_images("What do you see?", vec![image]),

]);

```

### Server usage (OpenAI multipart content)

Send images via the standard OpenAI multipart content format:

```bash

curl http://localhost:3000/v1/chat/completions \

  -H "Content-Type: application/json" \

  -d '{

    "model": "copilot_headless",

    "messages": [{

      "role": "user",

      "content": [

        {"type": "text", "text": "What do you see in this image?"},

        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBOR..."}}

      ]

    }]

  }'

```

Plain string messages continue to work unchanged. All providers accept images — native providers send them directly, CLI runners materialize them to temp files.

## Docker

Pull the image from GitHub Container Registry:

```bash

docker pull ghcr.io/dravr-ai/embacle:latest

```

The image includes `embacle-server` and `embacle-mcp` with Node.js pre-installed for adding CLI backends.

### Adding a CLI Backend

The base image doesn't include CLI tools. Install them in a derived image:

```dockerfile

FROM ghcr.io/dravr-ai/embacle

USER root

RUN npm install -g @anthropic-ai/claude-code

USER embacle

```

Build and run:

```bash

docker build -t my-embacle .

docker run -p 3000:3000 my-embacle --provider claude_code

```

### Auth and Configuration

CLI tools store auth tokens in their config directories. Mount them from the host, or set provider-specific env vars:

```bash

# Mount Claude Code auth from host

docker run -p 3000:3000 \

  -v ~/.claude:/home/embacle/.claude:ro \

  my-embacle --provider claude_code

# Or pass env vars if the CLI supports them

docker run -p 3000:3000 \

  -e GITHUB_TOKEN=ghp_... \

  -e EMBACLE_API_KEY=my-secret \

  my-embacle --provider copilot

```

### Running embacle-mcp

Override the entrypoint to run the MCP server instead:

```bash

docker run --entrypoint embacle-mcp ghcr.io/dravr-ai/embacle --provider copilot

```

## C FFI Static Library

Embacle provides a C FFI static library (`libembacle.a`) that exposes copilot chat completion to any language that can call C functions — Swift, Objective-C, Python, Go, Ruby, and more. The FFI surface is 4 functions: init, chat completion, free string, and shutdown.

### Install via Homebrew

```bash

brew tap dravr-ai/tap

brew install embacle-ffi

```

This builds from source (requires Rust) and installs `libembacle.a` and `embacle.h` to Homebrew's prefix. The formula is published automatically with each release.

For CI environments:

```bash

brew tap dravr-ai/tap

brew install embacle-ffi

# libembacle.a and embacle.h are now available under $(brew --prefix)/lib and $(brew --prefix)/include

```

### Install via script

```bash

./scripts/install-ffi.sh                        # → /usr/local

./scripts/install-ffi.sh --prefix $HOME/.local  # → custom prefix

./scripts/install-ffi.sh --uninstall            # remove

```

### Build manually

```bash

cargo build --release --features ffi

# Output: target/release/libembacle.a

# Header: include/embacle.h

```

### Swift / SPM example

For Swift Package Manager, add a `systemLibrary` target in your `Package.swift` with a modulemap that links `embacle`:

```swift

.systemLibrary(name: "CEmbacle")

```

With a `module.modulemap`:

```

module CEmbacle {

    header "embacle.h"

    link "embacle"

    export *

}

```

The FFI accepts OpenAI-compatible JSON — the same format as the REST API:

```c

embacle_init();

char* response = embacle_chat_completion(

    "{\"messages\":[{\"role\":\"user\",\"content\":\"hello\"}]}",

    60  /* timeout seconds */

);

/* use response JSON... */

embacle_free_string(response);

embacle_shutdown();

```

Vision payloads work via multipart content with `image_url` data URIs.

## Architecture

```

Your Application

    └── embacle (this library)

            │

            ├── CLI Runners (subprocess per request)

            │   ├── ClaudeCodeRunner    → spawns `claude -p "prompt" --output-format json`

            │   ├── CopilotRunner       → spawns `copilot -p "prompt"`

            │   ├── CursorAgentRunner   → spawns `cursor-agent -p "prompt" --output-format json`

            │   ├── OpenCodeRunner      → spawns `opencode run "prompt" --format json`

            │   ├── GeminiCliRunner     → spawns `gemini -p "prompt" -o json -y`

            │   ├── CodexCliRunner      → spawns `codex exec "prompt" --json --full-auto`

            │   ├── GooseCliRunner      → spawns `goose run --quiet --no-session`

            │   ├── ClineCliRunner      → spawns `cline task --json --act --yolo`

            │   ├── ContinueCliRunner   → spawns `cn -p --format json`

            │   ├── WarpCliRunner       → spawns `oz agent run --prompt "..." --output-format json`

            │   ├── KiroCliRunner       → spawns `kiro-cli send "prompt"`

            │   └── KiloCliRunner       → spawns `kilo run --auto --format json`

            │

            ├── HTTP API Runners (behind feature flag)

            │   └── OpenAiApiRunner       → reqwest to any OpenAI-compatible endpoint

            │

            ├── ACP Runners (persistent connection, behind feature flag)

            │   └── CopilotHeadlessRunner → NDJSON/JSON-RPC to `copilot --acp`

            │

            ├── Provider Decorators (composable wrappers)

            │   ├── FallbackProvider    → ordered chain with retry and exponential backoff

            │   ├── MetricsProvider     → latency, token, and cost tracking

            │   ├── QualityGateProvider → response validation with retry

            │   ├── GuardrailProvider   → pluggable pre/post request validation

            │   └── CacheProvider       → response caching with TTL and capacity

            │

            ├── Agent Loop

            │   └── AgentExecutor       → multi-turn tool calling with configurable max turns

            │

            ├── Structured Output

            │   └── request_structured_output()  → schema-validated JSON extraction with retry

            │

            ├── MCP Tool Bridge

            │   └── McpToolBridge       → MCP tool definitions ↔ text-based tool loop

            │

            ├── MCP Server (library + binary crate, powered by dravr-tronc)

            │   └── embacle-mcp         → JSON-RPC 2.0 over stdio or HTTP/SSE

            │

            ├── Unified REST API + MCP Server (binary crate, powered by dravr-tronc)

            │   └── embacle-server      → OpenAI-compatible HTTP, MCP Streamable HTTP, SSE streaming, multiplex

            │

            └── Tool Simulation (text-based tool calling for CLI runners)

                └── execute_with_text_tools()  → catalog injection, XML parsing, tool loop

```

All runners implement the same `LlmProvider` trait:

- **`complete()`** — single-shot completion

- **`complete_stream()`** — streaming completion

- **`health_check()`** — verify the runner is available and authenticated

For detailed API docs — fallback chains, structured output, agent loop, metrics, quality gates, tool simulation, and more — see [docs.rs/embacle](https://docs.rs/embacle).

## Tested With

Embacle has been tested with [mirroir.dev](https://github.com/jfarcand/mirroir-mcp), an MCP server for AI-powered iPhone automation.

## License

Licensed under the Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or ).
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/dravr-ai/dravr-embacle

Awesome Lists containing this project

README