https://github.com/klipitkas/opik-php

Community-maintained PHP SDK for Opik - an LLM observability and evaluation platform.
Host: GitHub
URL: https://github.com/klipitkas/opik-php
Owner: klipitkas
License: mit
Created: 2025-11-28T16:09:31.000Z (8 months ago)
Default Branch: main
Last Pushed: 2025-12-19T06:40:14.000Z (7 months ago)
Last Synced: 2025-12-20T03:27:01.406Z (7 months ago)
Topics: comet, llm, observability, opik, opikphp, php8, sdk, sdk-php, tracing
Language: PHP
Homepage: https://packagist.org/packages/klipitkas/opik-php
Size: 6.8 MB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

# Opik PHP SDK

> PHP SDK for [Opik](https://www.comet.com/docs/opik/) - an LLM observability and evaluation platform.

**NOTE**: This is a community-maintained SDK, not an official Comet ML product. For official SDKs, see [Python](https://github.com/comet-ml/opik/tree/main/sdks/python) and [TypeScript](https://github.com/comet-ml/opik/tree/main/sdks/typescript).

## Table of Contents

- [SDK Comparison](#sdk-comparison)

- [Installation](#installation)

- [Quick Start](#quick-start)

- [Configuration](#configuration)

- [Features](#features)

  - [Tracing](#tracing)

  - [Feedback Scores](#feedback-scores)

  - [Threads](#threads)

  - [Datasets](#datasets)

  - [Experiments](#experiments)

  - [Prompts](#prompts)

  - [Attachments](#attachments)

  - [Evaluation Metrics](#evaluation-metrics)

- [API Reference](#api-reference)

- [Development](#development)

---

## SDK Comparison

This table compares feature coverage between the official SDKs and this community PHP SDK.

| Category | Feature | Python | TypeScript | PHP | Notes |
|----------|---------|:------:|:----------:|:---:|-------|
| **Tracing** | Traces & Spans | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Nested Spans | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Search (OQL) | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Span Types | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Usage Tracking | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Cost Calculation | :white_check_mark: | :white_check_mark: | :white_check_mark: | User-provided pricing |
| | `@track` Decorator | :white_check_mark: | :white_check_mark: | :x: | PHP lacks decorators |
| **Feedback** | Feedback Scores | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Batch Feedback | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Threads | :white_check_mark: | :x: | :white_check_mark: | Full support |
| **Datasets** | CRUD Operations | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Flexible Schema | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | JSON Import/Export | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| **Experiments** | Create & Manage | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Log Items | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| **Prompts** | Text Prompts | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Chat Prompts | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| | Version History | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| **Attachments** | Upload/Download | :white_check_mark: | :x: | :white_check_mark: | Full support |
| **Evaluation** | Heuristic Metrics | :white_check_mark: | :white_check_mark: | :white_check_mark: | ExactMatch, Contains, RegexMatch, IsJson, Equals, LevenshteinRatio |
| | LLM Judge Metrics | :white_check_mark: | :white_check_mark: | :x: | Not implemented |
| | `evaluate()` | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |
| **Integrations** | OpenAI | :white_check_mark: | :white_check_mark: | :x: | Not implemented |
| | LangChain | :white_check_mark: | :white_check_mark: | :x: | Not implemented |
| | Other Frameworks | :white_check_mark: | :white_check_mark: | :x: | Not implemented |
| **Advanced** | Guardrails | :white_check_mark: | :x: | :x: | Not implemented |
| | Simulation | :white_check_mark: | :x: | :x: | Not implemented |
| | CLI Commands | :white_check_mark: | :x: | :x: | Not implemented |

### Coverage Summary

| SDK | Core Features | Advanced Features | Overall |
|-----|:-------------:|:-----------------:|:-------:|
| **Python** (Official) | 100% | 100% | 100% |
| **TypeScript** (Official) | ~90% | ~60% | ~80% |
| **PHP** (Community) | ~95% | ~25% | **~75%** |

### What's Missing in PHP SDK

**High Priority (Core Functionality):**

- LLM Judge Metrics (AnswerRelevance, Hallucination, etc.)

**Medium Priority (Integrations):**

- OpenAI integration for automatic tracing

- Other LLM provider integrations

**Low Priority (Advanced):**

- Guardrails (PII detection, topic filtering)

- Simulation framework

- CLI commands

- Local recording for testing

### Contributing

Contributions are welcome! If you'd like to help implement missing features, please see the [Development](#development) section.

---

## Installation

**Requirements:** PHP 8.1+, Composer

```bash

composer require klipitkas/opik-php

```

---

## Quick Start

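```php
<?php

// The namespaces of OpikClient and SpanType are assumed here;
// check the SDK source if these imports do not resolve.
use Opik\OpikClient;
use Opik\Tracer\SpanType;

// Reads credentials from the environment (see Configuration)
$client = new OpikClient();

// Create a trace
$trace = $client->trace(
    name: 'chat-completion',
    input: ['messages' => [['role' => 'user', 'content' => 'Hello!']]],
);

// Create an LLM span within the trace
$span = $trace->span(name: 'openai-call', type: SpanType::LLM);
$span->update(
    output: ['response' => 'Hi there!'],
    model: 'gpt-4',
    provider: 'openai',
    usage: new \Opik\Tracer\Usage(promptTokens: 10, completionTokens: 5, totalTokens: 15),
);
$span->end();

// End trace and flush
$trace->update(output: ['response' => 'Hi there!']);
$trace->end();
$client->flush();
```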

---

## Configuration

### Environment Variables

| Variable | Description | Required | Default |
|----------|-------------|----------|---------|
| `OPIK_API_KEY` | API key | Yes (cloud) | - |
| `OPIK_WORKSPACE` | Workspace name | Yes (cloud) | - |
| `OPIK_PROJECT_NAME` | Project name | No | `Default Project` |
| `OPIK_URL_OVERRIDE` | Custom API URL | No | - |
| `OPIK_DEBUG` | Enable debug mode | No | `false` |
| `OPIK_ENABLE_COMPRESSION` | Enable gzip compression | No | `true` |
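
As a rough illustration of how the variables and defaults in this table could be resolved, the sketch below reads them with `getenv()`. The function name `readOpikEnv()`, the returned array keys, and the boolean parsing rules are assumptions made for this example; the SDK's own loader may behave differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: resolve the documented variables with their defaults.
 * Not part of the SDK; names and parsing rules are illustrative only.
 */
function readOpikEnv(): array
{
    // getenv() returns false when a variable is unset.
    $get = static function (string $name, ?string $default = null): ?string {
        $value = getenv($name);
        return ($value === false || $value === '') ? $default : $value;
    };

    // Assumed boolean parsing: "1", "true", "yes", "on" count as true.
    $bool = static function (?string $value, bool $default): bool {
        if ($value === null) {
            return $default;
        }
        return filter_var($value, FILTER_VALIDATE_BOOLEAN);
    };

    return [
        'apiKey'      => $get('OPIK_API_KEY'),
        'workspace'   => $get('OPIK_WORKSPACE'),
        'projectName' => $get('OPIK_PROJECT_NAME', 'Default Project'),
        'baseUrl'     => $get('OPIK_URL_OVERRIDE'),
        'debug'       => $bool($get('OPIK_DEBUG'), false),
        'compression' => $bool($get('OPIK_ENABLE_COMPRESSION'), true),
    ];
}
```

Printing the result of `readOpikEnv()` in a shell where only `OPIK_API_KEY` and `OPIK_WORKSPACE` are exported is a quick way to see which defaults would apply.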

### Setup Methods

```bash

# Cloud (recommended)

export OPIK_API_KEY=your-api-key

export OPIK_WORKSPACE=your-workspace

export OPIK_PROJECT_NAME=your-project-name

```

```php

// From environment (recommended)

$client = new OpikClient();

// Explicit parameters

$client = new OpikClient(

    apiKey: 'your-api-key',

    workspace: 'your-workspace',

    projectName: 'my-project',

);

// Local development

$client = new OpikClient(baseUrl: 'http://localhost:5173/api/');

// Verify credentials

if ($client->authCheck()) {

    echo "Connected!";

}

```

---

## Features

### Tracing

#### Basic Trace with Spans

```php

$trace = $client->trace(name: 'my-trace', input: ['query' => 'Hello']);

$span = $trace->span(name: 'process', type: SpanType::LLM);

$span->update(output: ['result' => 'Done']);

$span->end();

$trace->end();

$client->flush();

```

#### Nested Spans

```php

$trace = $client->trace(name: 'multi-step');

$parent = $trace->span(name: 'parent');

$child1 = $parent->span(name: 'step-1', type: SpanType::TOOL);

$child1->end();

$child2 = $parent->span(name: 'step-2', type: SpanType::LLM);

$child2->end();

$parent->end();

$trace->end();

```
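
Every span opened above has to be closed with `end()`, which is easy to miss when the wrapped code throws. One possible pattern, sketched here with stand-in types, is a small helper that ends the span in a `finally` block. `Endable`, `withSpan()`, and `FakeSpan` are hypothetical names for this illustration and are not part of the SDK; a thin adapter around a real span object would be needed to use the idea in practice.

```php
<?php
declare(strict_types=1);

/** Hypothetical minimal contract: anything that can be ended. */
interface Endable
{
    public function end(): void;
}

/**
 * Run $work with the span, then end the span, even if $work throws.
 */
function withSpan(Endable $span, callable $work): mixed
{
    try {
        return $work($span);
    } finally {
        $span->end();
    }
}

/** Stand-in span used only to demonstrate the helper. */
final class FakeSpan implements Endable
{
    public bool $ended = false;

    public function end(): void
    {
        $this->ended = true;
    }
}

$span = new FakeSpan();
$result = withSpan($span, fn (FakeSpan $s): string => 'done');
// $result holds the callback's return value and $span->ended is now true.
```

Since PHP has no decorators, wrapping work in a callable like this is one way to approximate what `@track` does in the official SDKs, at the cost of some extra nesting.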

#### Search Traces and Spans

```php

// Search traces with OQL filter

$traces = $client->searchTraces(

    projectName: 'my-project',

    filter: 'name = "chat-completion"',

);

// Get specific trace/span

$trace = $client->getTraceContent('trace-id');

$span = $client->getSpanContent('span-id');

```

#### Span Types

| Type | Description |
|------|-------------|
| `SpanType::GENERAL` | General purpose span |
| `SpanType::LLM` | LLM API call |
| `SpanType::TOOL` | Tool/function call |
| `SpanType::GUARDRAIL` | Guardrail check |

#### Cost Calculation

Calculate and track LLM costs using your own pricing:

```php

use Opik\Cost\CostCalculator;

use Opik\Tracer\Usage;

$usage = new Usage(promptTokens: 1000, completionTokens: 500);

// Using per-million token pricing (common format)

$cost = CostCalculator::calculateFromMillionPricing(

    $usage,

    inputCostPerMillion: 2.50,   // $2.50 per 1M input tokens

    outputCostPerMillion: 10.00, // $10.00 per 1M output tokens

);

// Or using per-token pricing

$cost = CostCalculator::calculate(

    $usage,

    inputCostPerToken: 0.0000025,

    outputCostPerToken: 0.00001,

);

// Attach cost to span

$span->update(totalCost: $cost);

```
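
To make the arithmetic behind per-million pricing concrete, here is a stand-alone sketch using the same numbers as above. `costFromMillionPricing()` is a hypothetical function written for this example; it shows the formula that per-million pricing implies and is not the SDK's `CostCalculator` code.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative arithmetic only (hypothetical function, not the SDK's code):
 * cost = prompt tokens * input price + completion tokens * output price,
 * with both prices quoted per one million tokens.
 */
function costFromMillionPricing(
    int $promptTokens,
    int $completionTokens,
    float $inputCostPerMillion,
    float $outputCostPerMillion,
): float {
    $inputCost = $promptTokens * $inputCostPerMillion / 1_000_000;
    $outputCost = $completionTokens * $outputCostPerMillion / 1_000_000;

    return $inputCost + $outputCost;
}

// 1000 * 2.50 / 1e6 = 0.0025 and 500 * 10.00 / 1e6 = 0.005
$cost = costFromMillionPricing(1000, 500, 2.50, 10.00);
printf("%.4f\n", $cost); // prints 0.0075
```

The per-token variant is the same formula with the division by one million already folded into the prices: 2.50 / 1,000,000 = 0.0000025 and 10.00 / 1,000,000 = 0.00001.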

---

### Feedback Scores

#### On Traces and Spans

```php

$trace = $client->trace(name: 'scored-trace');

// Numeric score

$trace->logFeedbackScore(name: 'relevance', value: 0.95, reason: 'Good answer');

// Categorical score

$span = $trace->span(name: 'llm-call', type: SpanType::LLM);

$span->logFeedbackScore(name: 'sentiment', value: 1.0, categoryName: 'positive');

```

#### Batch Feedback Scores

```php

use Opik\Feedback\FeedbackScore;

// For traces

$client->logTracesFeedbackScores([

    FeedbackScore::forTrace('trace-1', 'quality', value: 0.9),

    FeedbackScore::forTrace('trace-2', 'quality', value: 0.85, reason: 'Good'),

]);

// For spans

$client->logSpansFeedbackScores([

    FeedbackScore::forSpan('span-1', 'accuracy', value: 0.95),

    FeedbackScore::forSpan('span-2', 'accuracy', categoryName: 'high'),

]);

// Delete feedback scores

$client->deleteTraceFeedbackScore('trace-id', 'quality');

$client->deleteSpanFeedbackScore('span-id', 'accuracy');

```

---

### Threads

Group related traces into conversations:

```php

use Opik\Feedback\FeedbackScore;

// Create traces in a thread

$trace1 = $client->trace(name: 'user-msg-1', threadId: 'conversation-123');

$trace1->end();

$trace2 = $client->trace(name: 'user-msg-2', threadId: 'conversation-123');

$trace2->end();

$client->flush();

// Close thread before scoring

$client->closeThread('conversation-123');

// Score the thread

$client->logThreadsFeedbackScores([

    FeedbackScore::forThread('conversation-123', 'satisfaction', value: 0.95),

]);

```

---

### Datasets

#### Create and Populate

```php

use Opik\Dataset\DatasetItem;

$dataset = $client->getOrCreateDataset(

    name: 'eval-dataset',

    description: 'Test cases',

);

// Standard schema

$dataset->insert([

    new DatasetItem(

        input: ['question' => 'What is PHP?'],

        expectedOutput: ['answer' => 'A programming language'],

        metadata: ['difficulty' => 'easy'],

    ),

]);

// Flexible schema

$dataset->insert([

    new DatasetItem(data: [

        'prompt' => 'Translate: Hello',

        'expected' => 'Bonjour',

    ]),

]);

```

#### Read and Manage

```php

// Get items

$items = $dataset->getItems(page: 1, size: 100);

foreach ($items as $item) {

    $input = $item->getInput();

    $output = $item->getExpectedOutput();

}

// Update/delete

$dataset->update($items);

$dataset->delete(['item-id-1', 'item-id-2']);

$dataset->clear(); // Delete all

// List/delete datasets

$datasets = $client->getDatasets();

$client->deleteDataset('dataset-name');

```

#### JSON Import/Export

```php

// Import from JSON string

$json = '[{"input": "question 1", "output": "answer 1"}, {"input": "question 2", "output": "answer 2"}]';

$dataset->insertFromJson($json);

// Import with key mapping (rename keys)

$json = '[{"Question": "What is PHP?", "Expected Answer": "A language"}]';

$dataset->insertFromJson($json, keysMapping: [

    'Question' => 'input',

    'Expected Answer' => 'expected_output',

]);

// Import while ignoring certain keys

$dataset->insertFromJson($json, ignoreKeys: ['internal_id', 'debug_info']);

// Export to JSON string

$json = $dataset->toJson();

// Export with key mapping

$json = $dataset->toJson(keysMapping: [

    'input' => 'Question',

    'expected_output' => 'Expected Answer',

]);

```
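
The effect of `keysMapping` and `ignoreKeys` can be pictured as a plain array transformation. The sketch below is a hypothetical helper, `remapJsonRows()`, that is not part of the SDK; it assumes ignored keys are dropped before the mapping is applied, which may differ from the SDK's actual order of operations.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the SDK's implementation): decode a JSON array of
 * objects, drop ignored keys, and rename the remaining keys via a mapping.
 *
 * @param array<string, string> $keysMapping source key => target key
 * @param list<string>          $ignoreKeys  keys to drop before mapping
 * @return list<array<string, mixed>>
 */
function remapJsonRows(string $json, array $keysMapping = [], array $ignoreKeys = []): array
{
    // JSON_THROW_ON_ERROR turns malformed input into a JsonException.
    $rows = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    $result = [];
    foreach ($rows as $row) {
        $mapped = [];
        foreach ($row as $key => $value) {
            if (in_array($key, $ignoreKeys, true)) {
                continue;
            }
            // Unmapped keys keep their original name.
            $mapped[$keysMapping[$key] ?? $key] = $value;
        }
        $result[] = $mapped;
    }

    return $result;
}

$json = '[{"Question": "What is PHP?", "Expected Answer": "A language", "debug_info": "x"}]';
$rows = remapJsonRows(
    $json,
    keysMapping: ['Question' => 'input', 'Expected Answer' => 'expected_output'],
    ignoreKeys: ['debug_info'],
);
// $rows[0] is ['input' => 'What is PHP?', 'expected_output' => 'A language']
```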

---

### Experiments

```php

use Opik\Experiment\ExperimentItem;

// Create experiment

$experiment = $client->createExperiment(

    name: 'gpt-4-eval',

    datasetName: 'eval-dataset',

);

// Log results

$experiment->logItems([

    new ExperimentItem(

        datasetItemId: 'item-1',

        traceId: 'trace-1',

        output: ['result' => 'Answer'],

        feedbackScores: [['name' => 'accuracy', 'value' => 0.9]],

    ),

]);

// Manage experiments

$experiment = $client->getExperimentById('experiment-id');

$client->updateExperiment(id: 'experiment-id', name: 'new-name');

$client->deleteExperiment('experiment-name');

```

---

### Prompts

Opik supports two types of prompts: **text prompts** (simple string templates) and **chat prompts** (an array of messages following OpenAI's chat format).

#### Text Prompts

```php

// Create a text prompt

$prompt = $client->createPrompt(

    name: 'greeting',

    template: 'Hello {{name}}, you asked: {{question}}',

);

// Get and format

$prompt = $client->getPrompt('greeting');

$text = $prompt->format(['name' => 'John', 'question' => 'How are you?']);

// Returns: "Hello John, you asked: How are you?"

```
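
For readers unfamiliar with `{{placeholder}}` templates, the substitution step can be sketched in a few lines of plain PHP. `renderTemplate()` is a hypothetical function written for this example and is not the SDK's `format()` implementation; in particular, leaving unknown placeholders untouched is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical renderer for {{placeholder}} templates (not the SDK's code).
 * Placeholders without a matching variable are left untouched.
 *
 * @param array<string, string> $variables
 */
function renderTemplate(string $template, array $variables): string
{
    return preg_replace_callback(
        '/\{\{\s*(\w+)\s*\}\}/',
        static fn (array $m): string => array_key_exists($m[1], $variables)
            ? (string) $variables[$m[1]]
            : $m[0],
        $template,
    ) ?? $template;
}

echo renderTemplate(
    'Hello {{name}}, you asked: {{question}}',
    ['name' => 'John', 'question' => 'How are you?'],
);
// prints: Hello John, you asked: How are you?
```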

#### Chat Prompts

```php

use Opik\Prompt\ChatMessage;

// Create a chat prompt with messages array

$prompt = $client->createPrompt(

    name: 'assistant-prompt',

    template: [

        ChatMessage::system('You are a helpful assistant specializing in {{domain}}.'),

        ChatMessage::user('{{question}}'),

    ],

);

// Format returns array of messages

$messages = $prompt->format(['domain' => 'physics', 'question' => 'What is gravity?']);

// Returns:

// [

//     ['role' => 'system', 'content' => 'You are a helpful assistant specializing in physics.'],

//     ['role' => 'user', 'content' => 'What is gravity?'],

// ]

```

#### ChatMessage Factory Methods

| Method | Description |
|--------|-------------|
| `ChatMessage::system($content)` | Create a system message |
| `ChatMessage::user($content)` | Create a user message |
| `ChatMessage::assistant($content)` | Create an assistant message |
| `ChatMessage::tool($content)` | Create a tool message |

#### Prompt Versions

```php

// Get version history

$history = $client->getPromptHistory('greeting');

// Get specific version

$version = $prompt->getVersion('commit-hash');

// Check prompt type

if ($version->isChat()) {

    $messages = $version->format($variables);

} else {

    $text = $version->format($variables);

}

```

#### Delete Prompts

```php

$client->deletePrompts(['prompt-id-1', 'prompt-id-2']);

```

---

### Attachments

Upload files to traces or spans:

```php

use Opik\Attachment\AttachmentEntityType;

$attachmentClient = $client->getAttachmentClient();

// Upload

$attachmentClient->uploadAttachment(

    projectName: 'my-project',

    entityType: AttachmentEntityType::TRACE,

    entityId: $trace->getId(),

    filePath: '/path/to/file.pdf',

);

// List

$attachments = $attachmentClient->getAttachmentList(

    projectName: 'my-project',

    entityType: AttachmentEntityType::TRACE,

    entityId: $trace->getId(),

);

// Download

$content = $attachmentClient->downloadAttachment(

    projectName: 'my-project',

    entityType: AttachmentEntityType::TRACE,

    entityId: $trace->getId(),

    fileName: 'file.pdf',

    mimeType: 'application/pdf',

);

```

---

### Evaluation Metrics

The SDK provides heuristic metrics for evaluating LLM outputs:

```php

use Opik\Evaluation\Metrics\ExactMatch;

use Opik\Evaluation\Metrics\Contains;

use Opik\Evaluation\Metrics\RegexMatch;

use Opik\Evaluation\Metrics\IsJson;

// ExactMatch - checks for exact equality

$metric = new ExactMatch();

$result = $metric->score([

    'output' => 'hello world',

    'expected' => 'hello world',

]);

echo $result->value; // 1.0 (match) or 0.0 (no match)

// Contains - checks if output contains expected substring

$metric = new Contains(caseSensitive: false);

$result = $metric->score([

    'output' => 'Hello World',

    'expected' => 'hello',

]);

echo $result->value; // 1.0

// RegexMatch - checks if output matches a regex pattern

$metric = new RegexMatch();

$result = $metric->score([

    'output' => 'Contact: test@example.com',

    'pattern' => '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/',

]);

echo $result->value; // 1.0

// IsJson - checks if output is valid JSON

$metric = new IsJson();

$result = $metric->score([

    'output' => '{"key": "value"}',

]);

echo $result->value; // 1.0

```

#### Available Metrics

| Metric | Description |
|--------|-------------|
| `ExactMatch` | Checks if output exactly equals expected (strict comparison) |
| `Contains` | Checks if output contains expected substring (supports case-insensitive matching) |
| `RegexMatch` | Checks if output matches a regex pattern |
| `IsJson` | Checks if output is valid JSON |
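
The four checks in this table are simple enough to express as plain functions, which can help when reasoning about what a score of 1.0 or 0.0 means. The functions below are hypothetical stand-ins written for this example, not the SDK's metric classes, and their edge-case behavior (empty strings, invalid patterns) is an assumption.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-alone versions of the four checks, each returning
// 1.0 (pass) or 0.0 (fail). These are not the SDK's classes.

function exactMatchScore(string $output, string $expected): float
{
    return $output === $expected ? 1.0 : 0.0;
}

function containsScore(string $output, string $expected, bool $caseSensitive = true): float
{
    $found = $caseSensitive
        ? str_contains($output, $expected)
        : stripos($output, $expected) !== false;

    return $found ? 1.0 : 0.0;
}

function regexMatchScore(string $output, string $pattern): float
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $output) === 1 ? 1.0 : 0.0;
}

function isJsonScore(string $output): float
{
    json_decode($output);

    return json_last_error() === JSON_ERROR_NONE ? 1.0 : 0.0;
}
```

For example, `containsScore('Hello World', 'hello', caseSensitive: false)` returns `1.0`, while the case-sensitive default returns `0.0` for the same inputs.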

#### Evaluation Function

Run evaluations against datasets with automatic experiment tracking:

```php

use Opik\Dataset\DatasetItem;

use Opik\Evaluation\Metrics\ExactMatch;

use Opik\Evaluation\Metrics\Contains;

// Get or create a dataset

$dataset = $client->getOrCreateDataset('qa-dataset');

$dataset->insert([

    new DatasetItem(data: [

        'input' => 'What is PHP?',

        'expected' => 'programming language',

    ]),

    new DatasetItem(data: [

        'input' => 'What is Python?',

        'expected' => 'programming language',

    ]),

]);

// Define your task function ($llm is a placeholder for your own LLM client)

$task = function (array $item) use ($llm): array {

    // Your LLM call or processing logic here

    $response = $llm->complete($item['input']);

    return ['output' => $response];

};

// Run evaluation

$result = $client->evaluate(

    dataset: $dataset,

    task: $task,

    scoringMetrics: [

        new ExactMatch(),

        new Contains(),

    ],

    experimentName: 'my-evaluation',

);

// Access results

echo "Evaluated {$result->count()} items in {$result->durationSeconds}s\n";

echo "Average exact_match: {$result->getAverageScore('exact_match')}\n";

echo "Average contains: {$result->getAverageScore('contains')}\n";

// Get all average scores

$averages = $result->getAverageScores();

foreach ($averages as $metric => $score) {

    echo "{$metric}: {$score}\n";

}

```

The `evaluate()` function:

- Creates an experiment for tracking results

- Runs the task function on each dataset item

- Calculates scores using the provided metrics

- Logs feedback scores to traces

- Returns detailed results with averages
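
The run-score-average part of the steps listed above can be sketched as a small in-memory loop. `evaluateLocally()` is a hypothetical function for illustration only: it does not create an experiment, log traces, or call the Opik API, and the metric callables and the canned task are stand-ins for real scorers and a real LLM call.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory evaluation loop (no experiment tracking, no network).
 *
 * @param list<array<string, mixed>>                   $items
 * @param callable(array): array                       $task
 * @param array<string, callable(array, array): float> $metrics name => scorer
 * @return array<string, float> average score per metric
 */
function evaluateLocally(array $items, callable $task, array $metrics): array
{
    $totals = array_fill_keys(array_keys($metrics), 0.0);

    foreach ($items as $item) {
        $output = $task($item);
        foreach ($metrics as $name => $scorer) {
            $totals[$name] += $scorer($item, $output);
        }
    }

    $count = count($items);
    if ($count === 0) {
        return $totals; // nothing to average
    }

    return array_map(static fn (float $sum): float => $sum / $count, $totals);
}

$items = [
    ['input' => 'What is PHP?', 'expected' => 'programming language'],
    ['input' => 'What is Python?', 'expected' => 'programming language'],
];

// Stand-in task: canned answers instead of a real LLM call.
$task = static fn (array $item): array => [
    'output' => str_contains($item['input'], 'PHP')
        ? 'programming language'
        : 'A programming language',
];

$averages = evaluateLocally($items, $task, [
    'exact_match' => static fn (array $item, array $out): float
        => $out['output'] === $item['expected'] ? 1.0 : 0.0,
    'contains' => static fn (array $item, array $out): float
        => str_contains($out['output'], $item['expected']) ? 1.0 : 0.0,
]);
// $averages is ['exact_match' => 0.5, 'contains' => 1.0]
```

The second item shows why pairing a strict and a lenient metric is useful: the answer fails the exact match but still contains the expected text.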

---

## API Reference

### OpikClient Methods

| Category | Method | Description |
|----------|--------|-------------|
| **Tracing** | `trace(...)` | Create a trace |
| | `span(...)` | Create a standalone span |
| | `searchTraces(...)` | Search traces with OQL |
| | `searchSpans(...)` | Search spans with OQL |
| | `getTraceContent(id)` | Get trace by ID |
| | `getSpanContent(id)` | Get span by ID |
| **Feedback** | `logTracesFeedbackScores(scores)` | Batch log trace scores |
| | `logSpansFeedbackScores(scores)` | Batch log span scores |
| | `logThreadsFeedbackScores(scores)` | Batch log thread scores |
| | `deleteTraceFeedbackScore(id, name)` | Delete trace score |
| | `deleteSpanFeedbackScore(id, name)` | Delete span score |
| **Threads** | `closeThread(id)` | Close a thread |
| | `closeThreads(ids)` | Close multiple threads |
| **Datasets** | `getDataset(name)` | Get dataset |
| | `getDatasets()` | List datasets |
| | `createDataset(name)` | Create dataset |
| | `getOrCreateDataset(name)` | Get or create dataset |
| | `deleteDataset(name)` | Delete dataset |
| **Experiments** | `createExperiment(name, datasetName)` | Create experiment |
| | `getExperiment(name)` | Get by name |
| | `getExperimentById(id)` | Get by ID |
| | `updateExperiment(id, ...)` | Update experiment |
| | `deleteExperiment(name)` | Delete experiment |
| **Prompts** | `createPrompt(name, template)` | Create text or chat prompt |
| | `getPrompt(name)` | Get prompt |
| | `getPrompts()` | List prompts |
| | `getPromptHistory(name)` | Get versions |
| | `deletePrompts(ids)` | Delete prompts |
| **Attachments** | `getAttachmentClient()` | Get attachment client |
| **Evaluation** | `evaluate(dataset, task, ...)` | Run evaluation with metrics |
| **Utilities** | `authCheck()` | Verify credentials |
| | `flush()` | Send pending data |
| | `getConfig()` | Get configuration |
| | `getProjectUrl()` | Get project URL |

### Trace Methods

| Method | Description |
|--------|-------------|
| `span(name, type?, ...)` | Create child span |
| `update(output?, ...)` | Update trace data |
| `end()` | End the trace |
| `logFeedbackScore(name, value, ...)` | Log feedback score |
| `getId()` | Get trace ID |

### Span Methods

| Method | Description |
|--------|-------------|
| `span(name, type?, ...)` | Create child span |
| `update(output?, model?, usage?, ...)` | Update span data |
| `end()` | End the span |
| `logFeedbackScore(name, value, ...)` | Log feedback score |
| `getId()` | Get span ID |

---

## Development

```bash

# Install dependencies

composer install

# Run tests

composer test

# Run with coverage (requires pcov/xdebug)

composer test:coverage

# Static analysis

composer analyse

# Code formatting

composer format

composer format:check

```

---

## License

MIT

## Trademarks

Opik and Comet ML are trademarks of Comet ML, Inc. This project is not affiliated with, endorsed by, or sponsored by Comet ML, Inc.