https://github.com/hiddenpath/ai-lib-rust

Host: GitHub
URL: https://github.com/hiddenpath/ai-lib-rust
Owner: hiddenpath
License: apache-2.0
Created: 2025-12-31T12:55:08.000Z (6 months ago)
Default Branch: main
Last Pushed: 2026-02-15T06:54:32.000Z (4 months ago)
Last Synced: 2026-02-15T12:57:51.621Z (4 months ago)
Language: Rust
Size: 594 KB
Stars: 4
Watchers: 0
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

# ai-lib-rust

**Protocol Runtime for AI-Protocol** - A high-performance Rust reference implementation.

`ai-lib-rust` is the Rust runtime implementation for the [AI-Protocol](https://github.com/hiddenpath/ai-protocol) specification. It embodies the core design principle: **all logic is operators, all configuration is protocol**.

## 🎯 Design Philosophy

Unlike traditional adapter libraries that hardcode provider-specific logic, `ai-lib-rust` is a **protocol-driven runtime** that executes AI-Protocol specifications. This means:

- **Zero hardcoded provider logic**: All behavior is driven by protocol manifests (source YAML or dist JSON)

- **Operator-based architecture**: Processing is done through composable operators (Decoder → Selector → Accumulator → FanOut → EventMapper)

- **Hot-reloadable**: Protocol configurations can be updated without restarting the application

- **Unified interface**: Developers interact with a single, consistent API regardless of the underlying provider

## 🏗️ Architecture

The library is organized into three layers:

### 1. Protocol Specification Layer (`protocol/`)

- **Loader**: Loads protocol files from local filesystem, embedded assets, or remote URLs

- **Validator**: Validates protocols against JSON Schema

- **Schema**: Protocol structure definitions

### 2. Pipeline Interpreter Layer (`pipeline/`)

- **Decoder**: Parses raw bytes into protocol frames (SSE, JSON Lines, etc.)

- **Selector**: Filters frames using JSONPath expressions

- **Accumulator**: Accumulates stateful data (e.g., tool call arguments)

- **FanOut**: Handles multi-candidate scenarios

- **EventMapper**: Converts protocol frames to unified events
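
To make the operator chain concrete, here is a minimal, std-only sketch of a Decoder → Selector → EventMapper pass over an SSE body. It is not the crate's implementation: the function names, the `Event` enum, the `ping` keep-alive frame, and the `[DONE]` end sentinel are assumptions chosen for illustration, and the Accumulator and FanOut stages are left out for brevity.

```rust
/// Unified events produced by this sketch (hypothetical, not the crate's `StreamingEvent`).
#[derive(Debug, PartialEq)]
pub enum Event {
    ContentDelta(String),
    StreamEnd,
}

/// Decoder: turn a raw SSE body into `data:` payloads.
/// Comment lines (starting with ':') and blank separators are dropped.
/// Note: trimming is a simplification; a real decoder must preserve payload whitespace.
pub fn decode_sse(raw: &str) -> Vec<String> {
    raw.lines()
        .filter_map(|line| line.strip_prefix("data:"))
        .map(|payload| payload.trim().to_string())
        .filter(|payload| !payload.is_empty())
        .collect()
}

/// Selector: a plain predicate standing in for a JSONPath filter.
/// Assumption: "ping" frames are keep-alives that carry no content.
pub fn select(frame: &str) -> bool {
    frame != "ping"
}

/// EventMapper: convert a selected frame into a unified event.
/// Assumption: "[DONE]" marks the end of the stream.
pub fn map_event(frame: &str) -> Event {
    if frame == "[DONE]" {
        Event::StreamEnd
    } else {
        Event::ContentDelta(frame.to_string())
    }
}

/// Compose the three operators in pipeline order.
pub fn run_pipeline(raw: &str) -> Vec<Event> {
    decode_sse(raw)
        .into_iter()
        .filter(|frame| select(frame))
        .map(|frame| map_event(&frame))
        .collect()
}
```

Because each stage is a small function over frames, swapping the decoder (for example, JSON Lines instead of SSE) leaves the later stages untouched, which is the property the operator design is after.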

### 3. User Interface Layer (`client/`, `types/`)

- **Client**: Unified client interface

- **Types**: Standard type system based on AI-Protocol `standard_schema`

## 🔄 V2 Protocol Alignment

Starting with v0.7.0, `ai-lib-rust` aligns with the **AI-Protocol V2** specification. v0.8.0 adds full V2 runtime support, including V2 manifest parsing, provider drivers, MCP, Computer Use, and extended multimodal.

### Standard Error Codes (V2)

All provider errors are classified into 13 standard error codes with unified retry/fallback semantics:

| Code | Name | Retryable | Fallbackable |
|------|------|-----------|--------------|
| E1001 | `invalid_request` | No | No |
| E1002 | `authentication` | No | Yes |
| E1003 | `permission_denied` | No | No |
| E1004 | `not_found` | No | No |
| E1005 | `request_too_large` | No | No |
| E2001 | `rate_limited` | Yes | Yes |
| E2002 | `quota_exhausted` | No | Yes |
| E3001 | `server_error` | Yes | Yes |
| E3002 | `overloaded` | Yes | Yes |
| E3003 | `timeout` | Yes | Yes |
| E4001 | `conflict` | Yes | No |
| E4002 | `cancelled` | No | No |
| E9999 | `unknown` | No | No |

Classification follows a priority pipeline: provider-specific error code → HTTP status override → standard HTTP mapping → `E9999`.
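
The priority pipeline can be sketched in plain Rust as below. The retry/fallback flags are copied from the table above; everything else is an assumption made for illustration: the `StdError` and `ProviderRules` types, the `classify` signature, and in particular the "standard HTTP mapping" arm, whose exact status-to-code table is defined by the AI-Protocol specification rather than by this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StdError {
    InvalidRequest, Authentication, PermissionDenied, NotFound, RequestTooLarge,
    RateLimited, QuotaExhausted, ServerError, Overloaded, Timeout,
    Conflict, Cancelled, Unknown,
}

impl StdError {
    pub fn code(self) -> &'static str {
        match self {
            Self::InvalidRequest => "E1001",
            Self::Authentication => "E1002",
            Self::PermissionDenied => "E1003",
            Self::NotFound => "E1004",
            Self::RequestTooLarge => "E1005",
            Self::RateLimited => "E2001",
            Self::QuotaExhausted => "E2002",
            Self::ServerError => "E3001",
            Self::Overloaded => "E3002",
            Self::Timeout => "E3003",
            Self::Conflict => "E4001",
            Self::Cancelled => "E4002",
            Self::Unknown => "E9999",
        }
    }

    /// "Retryable" column of the table.
    pub fn retryable(self) -> bool {
        matches!(self, Self::RateLimited | Self::ServerError | Self::Overloaded
            | Self::Timeout | Self::Conflict)
    }

    /// "Fallbackable" column of the table.
    pub fn fallbackable(self) -> bool {
        matches!(self, Self::Authentication | Self::RateLimited | Self::QuotaExhausted
            | Self::ServerError | Self::Overloaded | Self::Timeout)
    }
}

/// Hypothetical per-provider rules, as a manifest might declare them.
pub struct ProviderRules<'a> {
    pub code_map: &'a [(&'a str, StdError)],
    pub status_overrides: &'a [(u16, StdError)],
}

pub fn classify(rules: &ProviderRules<'_>, provider_code: Option<&str>, status: u16) -> StdError {
    // 1. Provider-specific error code wins.
    if let Some(code) = provider_code {
        if let Some((_, err)) = rules.code_map.iter().find(|(c, _)| *c == code) {
            return *err;
        }
    }
    // 2. Per-provider HTTP status override.
    if let Some((_, err)) = rules.status_overrides.iter().find(|(s, _)| *s == status) {
        return *err;
    }
    // 3. Standard HTTP mapping (ASSUMED values), then 4. fall through to E9999.
    match status {
        400 => StdError::InvalidRequest,
        401 => StdError::Authentication,
        403 => StdError::PermissionDenied,
        404 => StdError::NotFound,
        408 | 504 => StdError::Timeout,
        409 => StdError::Conflict,
        413 => StdError::RequestTooLarge,
        429 => StdError::RateLimited,
        503 => StdError::Overloaded,
        500..=599 => StdError::ServerError,
        _ => StdError::Unknown,
    }
}
```

A caller would typically branch on `retryable()` first (retry the same provider) and on `fallbackable()` second (switch to another provider), which is why the two flags are kept separate.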

### Compliance Tests

Cross-runtime behavioral consistency is verified by a shared YAML-based test suite from the `ai-protocol` repository:

```bash

# Run compliance tests

cargo test --test compliance

# With explicit compliance directory

COMPLIANCE_DIR=../ai-protocol/tests/compliance cargo test --test compliance

```

For details, see [CROSS_RUNTIME.md](https://github.com/hiddenpath/ai-protocol/blob/main/docs/CROSS_RUNTIME.md).

### Testing with ai-protocol-mock

For integration and MCP tests without real API calls, use [ai-protocol-mock](https://github.com/hiddenpath/ai-protocol-mock):

```bash

# Start mock server (from ai-protocol-mock repo)

docker-compose up -d

# Run tests with mock

MOCK_HTTP_URL=http://localhost:4010 MOCK_MCP_URL=http://localhost:4010/mcp cargo test -- --ignored --nocapture

# Run specific mock integration tests

MOCK_HTTP_URL=http://localhost:4010 cargo test test_sse_streaming_via_mock test_error_classification_via_mock -- --ignored --nocapture

```

Or in code: `AiClientBuilder::new().base_url_override("http://localhost:4010").build(...)`

## 🧩 Feature flags & re-exports

`ai-lib-rust` keeps the runtime core small, and exposes optional capabilities behind feature flags. This aligns with the V2 "lean core, progressive complexity" design principle.

For a deeper overview, see [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md).

- **Always available re-exports (crate root)**:

  - `AiClient`, `AiClientBuilder`, `CancelHandle`, `CallStats`, `ChatBatchRequest`, `ClientMetrics`, `EndpointExt`

  - `Message`, `MessageRole`, `StreamingEvent`, `ToolCall`

  - `Result`, `Error`, `ErrorContext`

  - `FeedbackEvent`, `FeedbackSink` (core feedback types)

- **Capability features (V2 aligned)**:

  - **`embeddings`**: embedding generation (`EmbeddingClient`)

  - **`batch`**: batch API processing (`BatchExecutor`)

  - **`guardrails`**: input/output validation

  - **`tokens`**: token counting and cost estimation

  - **`telemetry`**: advanced observability sinks (`InMemoryFeedbackSink`, `ConsoleFeedbackSink`, etc.)

  - **`mcp`**: MCP (Model Context Protocol) tool bridge — namespace-based tool conversion and filtering

  - **`computer_use`**: Computer Use abstraction — safety policies, domain allowlists, action validation

  - **`multimodal`**: Extended multimodal support — vision, audio, video modality validation and format checks

  - **`reasoning`**: Extended reasoning / chain-of-thought support

- **Infrastructure features**:

  - **`routing_mvp`**: pure logic model management helpers (`CustomModelManager`, `ModelArray`, etc.)

  - **`interceptors`**: application-layer call hooks (`InterceptorPipeline`, `Interceptor`, `RequestContext`)

- **Meta-feature**:

  - **`full`**: enables all capability and infrastructure features

Enable with:

```toml
[dependencies]
# Pick ONE of the following lines.

# Lean core (default)
ai-lib-rust = "0.8.0"

# With specific capabilities
# ai-lib-rust = { version = "0.8.0", features = ["embeddings", "telemetry"] }

# Everything enabled
# ai-lib-rust = { version = "0.8.0", features = ["full"] }
```

## 🗺️ Capability map (layered tools)

This is a structured view of what the crate provides, grouped by layers.

### 1) Protocol layer (`src/protocol/`)

- **`ProtocolLoader`**: load provider manifests from local paths / env paths / GitHub raw URLs

- **`ProtocolValidator`**: JSON Schema validation (supports offline via embedded schema)

- **`ProtocolManifest`**: typed representation of provider manifests

- **`UnifiedRequest`**: provider-agnostic request payload used by the runtime

### 2) Transport layer (`src/transport/`)

- **`HttpTransport`**: reqwest-based transport with proxy/timeout defaults and env knobs

- **API key resolution**: keyring → provider-specific `*_API_KEY` environment variable

### 3) Pipeline layer (`src/pipeline/`)

- **Operator pipeline**: decoder → selector → accumulator → fanout → event mapper

- **Streaming normalization**: maps provider frames to `StreamingEvent`

### 4) Client layer (`src/client/`)

- **`AiClient`**: runtime entry point; model-driven (`"provider/model"`)

- **Chat builder**: `client.chat().messages(...).stream().execute_stream()`

- **Batch**: `chat_batch`, `chat_batch_smart`

- **Observability**: `call_model_with_stats` returns `CallStats`

- **Cancellation**: `execute_stream_with_cancel()` → `CancelHandle`

- **Services**: `EndpointExt` for calling `services` declared in protocol manifests

### 5) Resilience layer (`src/resilience/` + `client/policy`)

- **Policy engine**: capability validation + retry/fallback decisions

- **Rate limiter**: token-bucket + adaptive header-driven mode

- **Circuit breaker**: minimal breaker with env or builder defaults

- **Backpressure**: max in-flight permit gating
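
As an illustration of the token-bucket mode mentioned above, here is a minimal, std-only sketch. It is not the crate's rate limiter: `TokenBucket`, its fields, and the explicit `now` parameter (seconds on a monotonic clock, injected so the behavior is deterministic) are hypothetical.

```rust
/// A bucket holds up to `capacity` tokens and refills at `refill_per_sec`.
/// Each request spends one token; an empty bucket means "wait or reject".
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: f64,
}

impl TokenBucket {
    /// Start full, with the clock at 0.0 seconds.
    pub fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last: 0.0 }
    }

    /// Refill for the time elapsed since the last call, then try to spend one token.
    pub fn try_acquire(&mut self, now: f64) -> bool {
        let elapsed = (now - self.last).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        // Never move the clock backwards.
        self.last = self.last.max(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With `TokenBucket::new(2.0, 1.0)`, two requests pass immediately as a burst, and after that roughly one request per second is admitted. An adaptive, header-driven mode could adjust `refill_per_sec` from provider rate-limit headers, though that part is not sketched here.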

### 6) Types layer (`src/types/`)

- **Messages**: `Message`, `MessageRole`, `MessageContent`, `ContentBlock`

- **Tools**: `ToolDefinition`, `FunctionDefinition`, `ToolCall`

- **Events**: `StreamingEvent`

### 7) Telemetry layer (`src/telemetry/`)

- **`FeedbackSink`** / **`FeedbackEvent`**: opt-in feedback reporting

- **Extended feedback types**: `RatingFeedback`, `ThumbsFeedback`, `TextFeedback`, `CorrectionFeedback`, `RegenerateFeedback`, `StopFeedback`

- **Multiple sinks**: `InMemoryFeedbackSink`, `ConsoleFeedbackSink`, `CompositeFeedbackSink`

- **Global sink management**: `get_feedback_sink()`, `set_feedback_sink()`, `report_feedback()`

### 8) Embedding layer (`src/embeddings/`) - NEW in v0.6.5

- **`EmbeddingClient`** / **`EmbeddingClientBuilder`**: Generate embeddings from text

- **Types**: `Embedding`, `EmbeddingRequest`, `EmbeddingResponse`, `EmbeddingUsage`

- **Vector operations**: `cosine_similarity`, `dot_product`, `euclidean_distance`, `manhattan_distance`

- **Utilities**: `normalize_vector`, `average_vectors`, `weighted_average_vectors`, `find_most_similar`
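
For readers unfamiliar with these operations, the sketch below shows what dot product, cosine similarity, and a nearest-candidate search compute. The function names mirror the list above, but the signatures and the `Option` return values are assumptions of this sketch, not the crate's actual API.

```rust
/// Sum of element-wise products. Extra elements of the longer slice are ignored.
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|).
/// Returns `None` for mismatched lengths, empty input, or a zero vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let norm_a = dot_product(a, a).sqrt();
    let norm_b = dot_product(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot_product(a, b) / (norm_a * norm_b))
}

/// Index of the candidate with the highest cosine similarity to `query`.
/// Candidates that cannot be compared are skipped.
pub fn find_most_similar(query: &[f32], candidates: &[Vec<f32>]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(query, c).map(|score| (i, score)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

Cosine similarity ignores vector length, so it is the usual choice for comparing embeddings whose magnitudes are not meaningful.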

### 9) Cache layer (`src/cache/`) - NEW in v0.6.5

- **`CacheBackend`** trait with `MemoryCache` and `NullCache` implementations

- **`CacheManager`**: TTL-based caching with statistics

- **`CacheKey`** / **`CacheKeyGenerator`**: Deterministic cache key generation
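
The idea behind TTL-based caching with statistics can be sketched as follows. This is a std-only illustration under stated assumptions, not the crate's `CacheManager`: the `TtlCache` type, its methods, and the explicit `now` parameter (seconds, injected for determinism) are hypothetical.

```rust
use std::collections::HashMap;

/// Entries expire `ttl_secs` after insertion; hits and misses are counted.
pub struct TtlCache {
    ttl_secs: u64,
    entries: HashMap<String, (String, u64)>, // key -> (value, expires_at)
    hits: u64,
    misses: u64,
}

impl TtlCache {
    pub fn new(ttl_secs: u64) -> Self {
        Self { ttl_secs, entries: HashMap::new(), hits: 0, misses: 0 }
    }

    pub fn put(&mut self, key: &str, value: &str, now: u64) {
        self.entries
            .insert(key.to_string(), (value.to_string(), now + self.ttl_secs));
    }

    /// Return the value if present and not expired; evict stale entries lazily.
    pub fn get(&mut self, key: &str, now: u64) -> Option<String> {
        let fresh = match self.entries.get(key) {
            Some((value, expires_at)) if now < *expires_at => Some(value.clone()),
            _ => None,
        };
        if fresh.is_some() {
            self.hits += 1;
        } else {
            self.entries.remove(key);
            self.misses += 1;
        }
        fresh
    }

    /// (hits, misses) since creation.
    pub fn stats(&self) -> (u64, u64) {
        (self.hits, self.misses)
    }
}
```

A deterministic key (for example, a hash over model name and message content) is what makes such a cache useful for repeated identical requests.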

### 10) Token layer (`src/tokens/`) - NEW in v0.6.5

- **`TokenCounter`** trait: `CharacterEstimator`, `AnthropicEstimator`, `CachingCounter`

- **`ModelPricing`**: Pre-configured pricing for GPT-4o, Claude models

- **`CostEstimate`**: Calculate request costs

### 11) Batch layer (`src/batch/`) - NEW in v0.6.5

- **`BatchCollector`** / **`BatchConfig`**: Accumulate requests for batch processing

- **`BatchExecutor`**: Execute batches with configurable strategies

- **`BatchResult`**: Structured batch execution results

### 12) Plugin layer (`src/plugins/`) - NEW in v0.6.5

- **`Plugin`** trait with lifecycle hooks

- **`PluginRegistry`**: Centralized plugin management

- **Hook system**: `HookType`, `Hook`, `HookManager`

- **Middleware**: `Middleware`, `MiddlewareChain` for request/response transformation

### 13) Utils (`src/utils/`)

- JSONPath mapping helpers, tool-call assembler, and small runtime utilities

### 14) Optional helpers (feature-gated)

- **`routing_mvp`** (`src/routing/`): model selection + endpoint array load balancing (pure logic)

- **`interceptors`** (`src/interceptors/`): hooks around calls for logging/metrics/audit

## 🚀 Quick Start

### Sharing the client across tasks

`AiClient` does not implement `Clone` (by design, for API key and provider ToS compliance).

Use `Arc` to share across async tasks:

```rust
use ai_lib_rust::{AiClient, Message};
use std::sync::Arc;

#[tokio::main]
async fn main() -> ai_lib_rust::Result<()> {
    let client = Arc::new(AiClient::new("openai/gpt-4o").await?);

    // Pass Arc::clone(&client) to spawned tasks
    let handle = tokio::spawn({
        let c = Arc::clone(&client);
        async move { c.chat().messages(vec![Message::user("Hi")]).execute().await }
    });

    let _ = handle.await?;
    Ok(())
}
```

### Basic Usage

```rust
use ai_lib_rust::{AiClient, Message};
use ai_lib_rust::types::events::StreamingEvent;
use futures::StreamExt;

#[tokio::main]
async fn main() -> ai_lib_rust::Result<()> {
    // Create client directly using provider/model string
    // This is fully protocol-driven and supports any provider defined in ai-protocol manifests
    let client = AiClient::new("anthropic/claude-3-5-sonnet").await?;

    let messages = vec![Message::user("Hello!")];

    // Streaming (unified events)
    let mut stream = client
        .chat()
        .messages(messages)
        .temperature(0.7)
        .stream()
        .execute_stream()
        .await?;

    while let Some(event) = stream.next().await {
        match event? {
            StreamingEvent::PartialContentDelta { content, .. } => print!("{content}"),
            StreamingEvent::StreamEnd { .. } => break,
            _ => {}
        }
    }

    Ok(())
}
```

### Multimodal (Image / Audio)

Multimodal inputs are represented as `MessageContent::Blocks(Vec<ContentBlock>)`.

```rust
use ai_lib_rust::{Message, MessageRole};
use ai_lib_rust::types::message::{MessageContent, ContentBlock};

// Build a user message that combines a text prompt with an image loaded from disk.
fn multimodal_message(image_path: &str) -> ai_lib_rust::Result<Message> {
    let blocks = vec![
        ContentBlock::text("Describe this image briefly."),
        ContentBlock::image_from_file(image_path)?,
    ];

    Ok(Message::with_content(
        MessageRole::User,
        MessageContent::blocks(blocks),
    ))
}
```

### Useful environment variables

- `AI_PROTOCOL_DIR` / `AI_PROTOCOL_PATH`: path to your local `ai-protocol` repo root (containing `v1/`)

- `AI_LIB_ATTEMPT_TIMEOUT_MS`: per-attempt timeout guard used by the unified policy engine

- `AI_LIB_BATCH_CONCURRENCY`: override concurrency limit for batch operations

### Custom Protocol

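```rust
use ai_lib_rust::protocol::ProtocolLoader;

#[tokio::main]
async fn main() -> ai_lib_rust::Result<()> {
    // Load manifests from a local checkout; hot reload picks up later edits
    let loader = ProtocolLoader::new()
        .with_base_path("./ai-protocol")
        .with_hot_reload(true);

    let _manifest = loader.load_provider("openai").await?;
    Ok(())
}
```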

## 📦 Installation

Add to your `Cargo.toml`:

```toml

[dependencies]

ai-lib-rust = "0.8.0"

tokio = { version = "1.0", features = ["full"] }

futures = "0.3"

```

## 🔧 Configuration

The library automatically looks for protocol manifests in the following locations (in order):

1. Custom path set via `ProtocolLoader::with_base_path()`

2. `AI_PROTOCOL_DIR` / `AI_PROTOCOL_PATH` (local path or GitHub raw URL)

3. Common dev paths: `ai-protocol/`, `../ai-protocol/`, `../../ai-protocol/`

4. Last resort: GitHub raw `hiddenpath/ai-protocol` (main)

For each base path, provider manifests are resolved in a backward-compatible order:

`dist/v1/providers/<provider>.json` → `v1/providers/<provider>.yaml`.

Protocol manifests should follow the AI-Protocol v1.5 specification structure. The runtime validates manifests against the official JSON Schema from the AI-Protocol repository.
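
The lookup order described in this section can be written down as a small, std-only sketch. It only enumerates candidates and does no I/O. The `Base` enum and both function names are hypothetical, the file name is assumed to be the provider id plus extension, and an `AI_PROTOCOL_DIR` value that is a GitHub raw URL is treated as an opaque path here for simplicity.

```rust
use std::path::{Path, PathBuf};

/// Where a manifest base may come from. `GitHubRawFallback` stands in for the
/// remote default; its exact URL is deliberately not spelled out in this sketch.
#[derive(Debug, PartialEq)]
pub enum Base {
    Local(PathBuf),
    GitHubRawFallback,
}

/// Base locations in priority order: custom path, env value, common dev paths,
/// then the remote fallback as a last resort.
pub fn base_candidates(custom: Option<&str>, env_dir: Option<&str>) -> Vec<Base> {
    let mut out = Vec::new();
    out.extend(custom.map(|p| Base::Local(PathBuf::from(p))));
    out.extend(env_dir.map(|p| Base::Local(PathBuf::from(p))));
    for p in ["ai-protocol", "../ai-protocol", "../../ai-protocol"] {
        out.push(Base::Local(PathBuf::from(p)));
    }
    out.push(Base::GitHubRawFallback);
    out
}

/// For one base, try the compiled JSON manifest first, then the YAML source.
/// Assumption: the file stem is the provider id.
pub fn manifest_candidates(base: &Path, provider: &str) -> [PathBuf; 2] {
    [
        base.join("dist/v1/providers").join(format!("{provider}.json")),
        base.join("v1/providers").join(format!("{provider}.yaml")),
    ]
}
```

A loader built on this would walk `base_candidates` in order and return the first manifest candidate that exists and validates.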

## 🔐 Provider Requirements (API Keys)

Most providers require an API key. The runtime reads keys from (in order):

1. **OS Keyring** (optional, convenience feature)

   - **Windows**: Uses Windows Credential Manager

   - **macOS**: Uses Keychain

   - **Linux**: Uses Secret Service API

   - Service: `ai-protocol`, Username: provider id

   - **Note**: Keyring is optional and may not work in containers/WSL. Falls back to environment variables automatically.

2. **Environment Variable** (recommended for production)

   - Format: the uppercased provider id followed by `_API_KEY` (e.g. `DEEPSEEK_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`)

   - **Recommended for**: CI/CD, containers, WSL, production deployments

**Example**:

```bash

# Set API key via environment variable (recommended)

export DEEPSEEK_API_KEY="sk-..."

export ANTHROPIC_API_KEY="sk-ant-..."

# Or use keyring (optional, for local development)

# Windows: Stored in Credential Manager

# macOS: Stored in Keychain

```

Provider-specific details vary, but `ai-lib-rust` normalizes them behind a unified client API.
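
The resolution order (keyring first, environment variable second) can be sketched as below. Both lookups are passed in as closures so the logic stays testable without touching a real keyring or the process environment. The function names are hypothetical, and the uppercase rule is inferred from the examples above rather than taken from the crate's source.

```rust
/// Derive the environment variable name for a provider id,
/// e.g. "deepseek" -> "DEEPSEEK_API_KEY" (rule inferred from the examples above).
pub fn api_key_env_var(provider_id: &str) -> String {
    format!("{}_API_KEY", provider_id.to_ascii_uppercase())
}

/// Try the keyring lookup first, then fall back to the environment lookup.
/// `keyring_lookup` receives the provider id; `env_lookup` receives the variable name.
pub fn resolve_api_key(
    provider_id: &str,
    keyring_lookup: impl Fn(&str) -> Option<String>,
    env_lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    keyring_lookup(provider_id).or_else(|| env_lookup(&api_key_env_var(provider_id)))
}
```

In a real program the environment lookup could simply be `|name| std::env::var(name).ok()`, and a keyring lookup that always returns `None` models containers or WSL where no keyring is available.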

## 🌐 Proxy / Timeout / Backpressure (Production knobs)

- **Proxy**: set `AI_PROXY_URL` (e.g. `http://user:pass@host:port`)

- **HTTP timeout**: set `AI_HTTP_TIMEOUT_SECS` (fallback: `AI_TIMEOUT_SECS`)

- **In-flight limit**: set `AI_LIB_MAX_INFLIGHT` or use `AiClientBuilder::max_inflight(n)`

- **Rate limiting** (optional): set either

  - `AI_LIB_RPS` (requests per second), or

  - `AI_LIB_RPM` (requests per minute)

- **Circuit breaker** (optional): enable via `AiClientBuilder::circuit_breaker_default()` or env

  - `AI_LIB_BREAKER_FAILURE_THRESHOLD` (default 5)

  - `AI_LIB_BREAKER_COOLDOWN_SECS` (default 30)
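
To show how a failure threshold and a cooldown interact, here is a minimal circuit-breaker sketch using the defaults listed above (5 failures, 30 seconds). It is an illustration only: the `CircuitBreaker` type, its methods, and the injected `now` (seconds) are hypothetical, and the crate's breaker may differ in its state handling.

```rust
/// Opens after `failure_threshold` consecutive failures and rejects calls
/// until `cooldown_secs` have passed, after which a probe call is allowed.
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown_secs: u64,
    consecutive_failures: u32,
    opened_at: Option<u64>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown_secs: u64) -> Self {
        Self { failure_threshold, cooldown_secs, consecutive_failures: 0, opened_at: None }
    }

    /// Closed: always allow. Open: allow only once the cooldown has elapsed.
    pub fn allow(&self, now: u64) -> bool {
        match self.opened_at {
            Some(opened) => now.saturating_sub(opened) >= self.cooldown_secs,
            None => true,
        }
    }

    /// A success closes the breaker and resets the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure extends the streak; at the threshold the breaker (re)opens.
    pub fn record_failure(&mut self, now: u64) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```

Note that a failed probe after the cooldown reopens the breaker immediately, because the failure streak is only reset by a success.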

## 📊 Observability: CallStats

If you need per-call stats (latency, retries, request ids, endpoint), use:

```rust

let (resp, stats) = client.call_model_with_stats(unified_req).await?;

println!("client_request_id={}", stats.client_request_id);

```

## 🛑 Cancellable Streaming

```rust

let (mut stream, cancel) = client.chat().messages(messages).stream().execute_stream_with_cancel().await?;

// cancel.cancel(); // emits StreamEnd{finish_reason:"cancelled"}, drops the underlying network stream, and releases inflight permit

```

## 🧾 Optional Feedback (Choice Selection)

Telemetry is **opt-in**. You can inject a `FeedbackSink` and report feedback explicitly:

```rust

use ai_lib_rust::telemetry::{FeedbackEvent, ChoiceSelectionFeedback};

client.report_feedback(FeedbackEvent::ChoiceSelection(ChoiceSelectionFeedback {

    request_id: stats.client_request_id.clone(),

    chosen_index: 0,

    rejected_indices: None,

    latency_to_select_ms: None,

    ui_context: None,

    candidate_hashes: None,

})).await?;

```

## 🎨 Key Features

### Protocol-Driven Architecture

No `match provider` statements. All logic is derived from protocol configuration:

```rust

// The pipeline is built dynamically from protocol manifest

let pipeline = Pipeline::from_manifest(&manifest)?;

// Operators are configured via manifests (YAML/JSON), not hardcoded

// Adding a new provider requires zero code changes

```

### Multi-Candidate Support

Automatically handles multi-candidate scenarios through the `FanOut` operator:

```yaml

streaming:

  candidate:

    candidate_id_path: "$.choices[*].index"

    fan_out: true

```
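
Conceptually, fan-out routes each streamed delta to the candidate it belongs to. The sketch below assumes the candidate index has already been extracted from each frame (the job of `candidate_id_path`), so frames arrive as `(index, delta)` pairs; the `fan_out` function itself is hypothetical.

```rust
use std::collections::BTreeMap;

/// Group interleaved deltas by candidate index, preserving arrival order
/// within each candidate.
pub fn fan_out(frames: &[(u32, &str)]) -> BTreeMap<u32, String> {
    let mut per_candidate = BTreeMap::new();
    for (index, delta) in frames {
        per_candidate
            .entry(*index)
            .or_insert_with(String::new)
            .push_str(delta);
    }
    per_candidate
}
```

For example, the interleaved frames `(0, "Hel")`, `(1, "Bon")`, `(0, "lo")`, `(1, "jour")` end up as two separate texts, one per candidate.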

### Tool Accumulation

Stateful accumulation of tool call arguments:

```yaml

streaming:

  accumulator:

    stateful_tool_parsing: true

    key_path: "$.delta.partial_json"

    flush_on: "$.type == 'content_block_stop'"

```
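
What this configuration asks the runtime to do can be sketched as a small state machine: buffer `partial_json` fragments and emit the assembled argument string when a `content_block_stop` frame arrives. The `Frame` enum and `ToolArgAccumulator` type are hypothetical stand-ins; in the real pipeline the fragments are selected by `key_path` and the flush condition by `flush_on`.

```rust
/// Simplified view of the frames the accumulator cares about.
pub enum Frame<'a> {
    /// A fragment found at `$.delta.partial_json`.
    PartialJson(&'a str),
    /// A frame with `type == "content_block_stop"`.
    ContentBlockStop,
    /// Anything else passes by without touching the buffer.
    Other,
}

#[derive(Default)]
pub struct ToolArgAccumulator {
    buffer: String,
}

impl ToolArgAccumulator {
    /// Feed one frame; returns the complete argument JSON when a flush happens.
    pub fn push(&mut self, frame: Frame<'_>) -> Option<String> {
        match frame {
            Frame::PartialJson(fragment) => {
                self.buffer.push_str(fragment);
                None
            }
            Frame::ContentBlockStop if !self.buffer.is_empty() => {
                // Hand out the buffer and leave an empty one behind.
                Some(std::mem::take(&mut self.buffer))
            }
            Frame::ContentBlockStop | Frame::Other => None,
        }
    }
}
```

Keeping the fragments as an opaque string until the flush avoids parsing incomplete JSON, which would fail on every intermediate frame.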

### Hot Reload

Protocol configurations can be updated at runtime:

```rust

let loader = ProtocolLoader::new().with_hot_reload(true);

// Protocol changes are automatically picked up

```

## 📚 Examples

See the `examples/` directory:

- `basic_usage.rs`: Simple non-streaming chat completion

- `deepseek_chat_stream.rs`: Streaming chat example

- `deepseek_tool_call_stream.rs`: Tool calling with streaming

- `custom_protocol.rs`: Loading custom protocol configurations

- `list_models.rs`: Listing available models from provider

- `service_discovery.rs`: Service discovery and custom service calls

- `test_protocol_loading.rs`: Protocol loading sanity check

## 🧪 Testing

```bash

# Run all tests

cargo test

# Run compliance tests (cross-runtime consistency)

cargo test --test compliance

# Run with all features enabled

cargo test --features full

```

## 📦 Batch (Chat)

For batch execution (order-preserving), use:

```rust

use ai_lib_rust::{AiClient, ChatBatchRequest, Message};

let client = AiClient::new("deepseek/deepseek-chat").await?;

let reqs = vec![

    ChatBatchRequest::new(vec![Message::user("Hello")]),

    ChatBatchRequest::new(vec![Message::user("Explain SSE in one sentence")])

        .temperature(0.2),

];

let results = client.chat_batch(reqs, Some(5)).await;

```
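
"Order-preserving" means result `i` always corresponds to request `i`, even when requests finish out of order. The sketch below shows one simple way to get that property with bounded concurrency, using scoped threads from the standard library instead of async tasks. It is an assumption-laden illustration, not how `chat_batch` is implemented: `run_batch` is hypothetical and processes the input in waves of at most `concurrency` items.

```rust
use std::thread;

/// Apply `f` to every input with at most `concurrency` calls in flight,
/// returning results in input order.
pub fn run_batch<T, R, F>(inputs: Vec<T>, concurrency: usize, f: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let wave_size = concurrency.max(1);
    let mut results = Vec::with_capacity(inputs.len());
    let mut remaining = inputs.into_iter();
    loop {
        // Take the next wave of inputs; stop when nothing is left.
        let wave: Vec<T> = remaining.by_ref().take(wave_size).collect();
        if wave.is_empty() {
            break;
        }
        let wave_results: Vec<R> = thread::scope(|s| {
            // Spawn every worker of the wave first...
            let handles: Vec<_> = wave
                .into_iter()
                .map(|item| {
                    let f = &f;
                    s.spawn(move || f(item))
                })
                .collect();
            // ...then join in spawn order, which preserves input order.
            handles
                .into_iter()
                .map(|h| h.join().expect("worker panicked"))
                .collect()
        });
        results.extend(wave_results);
    }
    results
}
```

Wave-based scheduling is easy to reason about but lets one slow item hold back its whole wave; a semaphore-style limiter keeps all slots busy at the cost of a little more bookkeeping.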

### Smart batch tuning

If you prefer a conservative default heuristic, use:

```rust

let results = client.chat_batch_smart(reqs).await;

```

Override concurrency with:

- `AI_LIB_BATCH_CONCURRENCY`

## 🤝 Contributing

Contributions are welcome! Please ensure that:

1. All protocol configurations follow the AI-Protocol specification (v1.5 / V2)

2. New operators are properly documented

3. Tests are included for new features

4. Compliance tests pass for cross-runtime behaviors (`cargo test --test compliance`)

5. Code follows Rust best practices and passes `cargo clippy`

## 📄 License

This project is licensed under either of:

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))

- MIT License ([LICENSE-MIT](LICENSE-MIT))

at your option.

## 🔗 Related Projects

- [AI-Protocol](https://github.com/hiddenpath/ai-protocol): Protocol specification (v1.5 / V2)

- [ai-lib-python](https://github.com/hiddenpath/ai-lib-python): Python runtime implementation

---

**ai-lib-rust** - Where protocol meets performance. 🚀