{"id":36317126,"url":"https://github.com/klipitkas/opik-php","last_synced_at":"2026-01-11T11:03:58.642Z","repository":{"id":328963376,"uuid":"1106093922","full_name":"klipitkas/opik-php","owner":"klipitkas","description":"Community-maintained PHP SDK for Opik - an LLM observability and evaluation platform.","archived":false,"fork":false,"pushed_at":"2025-12-19T06:40:14.000Z","size":7132,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-20T03:27:01.406Z","etag":null,"topics":["comet","llm","observability","opik","opikphp","php8","sdk","sdk-php","tracing"],"latest_commit_sha":null,"homepage":"https://packagist.org/packages/klipitkas/opik-php","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/klipitkas.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-28T16:09:31.000Z","updated_at":"2025-12-19T11:09:14.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/klipitkas/opik-php","commit_stats":null,"previous_names":["klipitkas/opik-php"],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/klipitkas/opik-php","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/klipitkas%2Fopik-php","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/klipitkas%2Fopik-php/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/klipitkas%2Fopik-php/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/klipitkas%2Fopik-php/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/klipitkas","download_url":"https://codeload.github.com/klipitkas/opik-php/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/klipitkas%2Fopik-php/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28301411,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-11T08:21:30.231Z","status":"ssl_error","status_checked_at":"2026-01-11T08:21:26.882Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["comet","llm","observability","opik","opikphp","php8","sdk","sdk-php","tracing"],"created_at":"2026-01-11T11:03:57.861Z","updated_at":"2026-01-11T11:03:58.627Z","avatar_url":"https://github.com/klipitkas.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Opik PHP SDK\n\n\u003e PHP SDK for [Opik](https://www.comet.com/docs/opik/) - an LLM observability and evaluation platform.\n\n**NOTE**: This is a community-maintained SDK, not an official Comet ML product. For official SDKs, see [Python](https://github.com/comet-ml/opik/tree/main/sdks/python) and [TypeScript](https://github.com/comet-ml/opik/tree/main/sdks/typescript).\n\n## Table of Contents\n\n- [SDK Comparison](#sdk-comparison)\n- [Installation](#installation)\n- [Quick Start](#quick-start)\n- [Configuration](#configuration)\n- [Features](#features)\n  - [Tracing](#tracing)\n  - [Feedback Scores](#feedback-scores)\n  - [Threads](#threads)\n  - [Datasets](#datasets)\n  - [Experiments](#experiments)\n  - [Prompts](#prompts)\n  - [Attachments](#attachments)\n- [API Reference](#api-reference)\n- [Development](#development)\n\n---\n\n## SDK Comparison\n\nThis table compares feature coverage between the official SDKs and this community PHP SDK.\n\n| Category | Feature | Python | TypeScript | PHP | Notes |\n|----------|---------|:------:|:----------:|:---:|-------|\n| **Tracing** | Traces \u0026 Spans | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Nested Spans | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Search (OQL) | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Span Types | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Usage Tracking | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Cost Calculation | :white_check_mark: | :white_check_mark: | :white_check_mark: | User-provided pricing |\n| | `@track` Decorator | :white_check_mark: | :white_check_mark: | :x: | PHP lacks decorators |\n| **Feedback** | Feedback Scores | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Batch Feedback | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Threads | :white_check_mark: | :x: | :white_check_mark: | Full support |\n| **Datasets** | CRUD Operations | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Flexible Schema | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | JSON Import/Export | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| **Experiments** | Create \u0026 Manage | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Log Items | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| **Prompts** | Text Prompts | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Chat Prompts | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| | Version History | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| **Attachments** | Upload/Download | :white_check_mark: | :x: | :white_check_mark: | Full support |\n| **Evaluation** | Heuristic Metrics | :white_check_mark: | :white_check_mark: | :white_check_mark: | ExactMatch, Contains, RegexMatch, IsJson, Equals, LevenshteinRatio |\n| | LLM Judge Metrics | :white_check_mark: | :white_check_mark: | :x: | Not implemented |\n| | `evaluate()` | :white_check_mark: | :white_check_mark: | :white_check_mark: | Full support |\n| **Integrations** | OpenAI | :white_check_mark: | :white_check_mark: | :x: | Not implemented |\n| | LangChain | :white_check_mark: | :white_check_mark: | :x: | Not implemented |\n| | Other Frameworks | :white_check_mark: | :white_check_mark: | :x: | Not implemented |\n| **Advanced** | Guardrails | :white_check_mark: | :x: | :x: | Not implemented |\n| | Simulation | :white_check_mark: | :x: | :x: | Not implemented |\n| | CLI Commands | :white_check_mark: | :x: | :x: | Not implemented |\n\n### Coverage Summary\n\n| SDK | Core Features | Advanced Features | Overall |\n|-----|:-------------:|:-----------------:|:-------:|\n| **Python** (Official) | 100% | 100% | 100% |\n| **TypeScript** (Official) | ~90% | ~60% | ~80% |\n| **PHP** (Community) | ~95% | ~25% | **~75%** |\n\n### What's Missing in PHP SDK\n\n**High Priority (Core Functionality):**\n\n- LLM Judge Metrics (AnswerRelevance, Hallucination, etc.)\n\n**Medium Priority (Integrations):**\n\n- OpenAI integration for automatic tracing\n- Other LLM provider integrations\n\n**Low Priority (Advanced):**\n\n- Guardrails (PII detection, topic filtering)\n- Simulation framework\n- CLI commands\n- Local recording for testing\n\n### Contributing\n\nContributions are welcome! If you'd like to help implement missing features, please see the [Development](#development) section.\n\n---\n\n## Installation\n\n**Requirements:** PHP 8.1+, Composer\n\n```bash\ncomposer require klipitkas/opik-php\n```\n\n---\n\n## Quick Start\n\n```php\n\u003c?php\n\nuse Opik\\OpikClient;\nuse Opik\\Tracer\\SpanType;\n\n$client = new OpikClient();\n\n// Create a trace\n$trace = $client-\u003etrace(\n    name: 'chat-completion',\n    input: ['messages' =\u003e [['role' =\u003e 'user', 'content' =\u003e 'Hello!']]],\n);\n\n// Create an LLM span within the trace\n$span = $trace-\u003espan(name: 'openai-call', type: SpanType::LLM);\n$span-\u003eupdate(\n    output: ['response' =\u003e 'Hi there!'],\n    model: 'gpt-4',\n    provider: 'openai',\n    usage: new \\Opik\\Tracer\\Usage(promptTokens: 10, completionTokens: 5, totalTokens: 15),\n);\n$span-\u003eend();\n\n// End trace and flush\n$trace-\u003eupdate(output: ['response' =\u003e 'Hi there!']);\n$trace-\u003eend();\n$client-\u003eflush();\n```\n\n---\n\n## Configuration\n\n### Environment Variables\n\n| Variable | Description | Required | Default |\n|----------|-------------|----------|---------|\n| `OPIK_API_KEY` | API key | Yes (cloud) | - |\n| `OPIK_WORKSPACE` | Workspace name | Yes (cloud) | - |\n| `OPIK_PROJECT_NAME` | Project name | No | `Default Project` |\n| `OPIK_URL_OVERRIDE` | Custom API URL | No | - |\n| `OPIK_DEBUG` | Enable debug mode | No | `false` |\n| `OPIK_ENABLE_COMPRESSION` | Enable gzip compression | No | `true` |\n\n### Setup Methods\n\n```bash\n# Cloud (recommended)\nexport OPIK_API_KEY=your-api-key\nexport OPIK_WORKSPACE=your-workspace\nexport OPIK_PROJECT_NAME=your-project-name\n```\n\n```php\n// From environment (recommended)\n$client = new OpikClient();\n\n// Explicit parameters\n$client = new OpikClient(\n    apiKey: 'your-api-key',\n    workspace: 'your-workspace',\n    projectName: 'my-project',\n);\n\n// Local development\n$client = new OpikClient(baseUrl: 'http://localhost:5173/api/');\n\n// Verify credentials\nif ($client-\u003eauthCheck()) {\n    echo \"Connected!\";\n}\n```\n\n---\n\n## Features\n\n### Tracing\n\n#### Basic Trace with Spans\n\n```php\n$trace = $client-\u003etrace(name: 'my-trace', input: ['query' =\u003e 'Hello']);\n\n$span = $trace-\u003espan(name: 'process', type: SpanType::LLM);\n$span-\u003eupdate(output: ['result' =\u003e 'Done']);\n$span-\u003eend();\n\n$trace-\u003eend();\n$client-\u003eflush();\n```\n\n#### Nested Spans\n\n```php\n$trace = $client-\u003etrace(name: 'multi-step');\n$parent = $trace-\u003espan(name: 'parent');\n\n$child1 = $parent-\u003espan(name: 'step-1', type: SpanType::TOOL);\n$child1-\u003eend();\n\n$child2 = $parent-\u003espan(name: 'step-2', type: SpanType::LLM);\n$child2-\u003eend();\n\n$parent-\u003eend();\n$trace-\u003eend();\n```\n\n#### Search Traces and Spans\n\n```php\n// Search traces with OQL filter\n$traces = $client-\u003esearchTraces(\n    projectName: 'my-project',\n    filter: 'name = \"chat-completion\"',\n);\n\n// Get specific trace/span\n$trace = $client-\u003egetTraceContent('trace-id');\n$span = $client-\u003egetSpanContent('span-id');\n```\n\n#### Span Types\n\n| Type | Description |\n|------|-------------|\n| `SpanType::GENERAL` | General purpose span |\n| `SpanType::LLM` | LLM API call |\n| `SpanType::TOOL` | Tool/function call |\n| `SpanType::GUARDRAIL` | Guardrail check |\n\n#### Cost Calculation\n\nCalculate and track LLM costs using your own pricing:\n\n```php\nuse Opik\\Cost\\CostCalculator;\nuse Opik\\Tracer\\Usage;\n\n$usage = new Usage(promptTokens: 1000, completionTokens: 500);\n\n// Using per-million token pricing (common format)\n$cost = CostCalculator::calculateFromMillionPricing(\n    $usage,\n    inputCostPerMillion: 2.50,   // $2.50 per 1M input tokens\n    outputCostPerMillion: 10.00, // $10.00 per 1M output tokens\n);\n\n// Or using per-token pricing\n$cost = CostCalculator::calculate(\n    $usage,\n    inputCostPerToken: 0.0000025,\n    outputCostPerToken: 0.00001,\n);\n\n// Attach cost to span\n$span-\u003eupdate(totalCost: $cost);\n```\n\n---\n\n### Feedback Scores\n\n#### On Traces and Spans\n\n```php\n$trace = $client-\u003etrace(name: 'scored-trace');\n\n// Numeric score\n$trace-\u003elogFeedbackScore(name: 'relevance', value: 0.95, reason: 'Good answer');\n\n// Categorical score\n$span = $trace-\u003espan(name: 'llm-call', type: SpanType::LLM);\n$span-\u003elogFeedbackScore(name: 'sentiment', value: 1.0, categoryName: 'positive');\n```\n\n#### Batch Feedback Scores\n\n```php\nuse Opik\\Feedback\\FeedbackScore;\n\n// For traces\n$client-\u003elogTracesFeedbackScores([\n    FeedbackScore::forTrace('trace-1', 'quality', value: 0.9),\n    FeedbackScore::forTrace('trace-2', 'quality', value: 0.85, reason: 'Good'),\n]);\n\n// For spans\n$client-\u003elogSpansFeedbackScores([\n    FeedbackScore::forSpan('span-1', 'accuracy', value: 0.95),\n    FeedbackScore::forSpan('span-2', 'accuracy', categoryName: 'high'),\n]);\n\n// Delete feedback scores\n$client-\u003edeleteTraceFeedbackScore('trace-id', 'quality');\n$client-\u003edeleteSpanFeedbackScore('span-id', 'accuracy');\n```\n\n---\n\n### Threads\n\nGroup related traces into conversations:\n\n```php\nuse Opik\\Feedback\\FeedbackScore;\n\n// Create traces in a thread\n$trace1 = $client-\u003etrace(name: 'user-msg-1', threadId: 'conversation-123');\n$trace1-\u003eend();\n\n$trace2 = $client-\u003etrace(name: 'user-msg-2', threadId: 'conversation-123');\n$trace2-\u003eend();\n$client-\u003eflush();\n\n// Close thread before scoring\n$client-\u003ecloseThread('conversation-123');\n\n// Score the thread\n$client-\u003elogThreadsFeedbackScores([\n    FeedbackScore::forThread('conversation-123', 'satisfaction', value: 0.95),\n]);\n```\n\n---\n\n### Datasets\n\n#### Create and Populate\n\n```php\nuse Opik\\Dataset\\DatasetItem;\n\n$dataset = $client-\u003egetOrCreateDataset(\n    name: 'eval-dataset',\n    description: 'Test cases',\n);\n\n// Standard schema\n$dataset-\u003einsert([\n    new DatasetItem(\n        input: ['question' =\u003e 'What is PHP?'],\n        expectedOutput: ['answer' =\u003e 'A programming language'],\n        metadata: ['difficulty' =\u003e 'easy'],\n    ),\n]);\n\n// Flexible schema\n$dataset-\u003einsert([\n    new DatasetItem(data: [\n        'prompt' =\u003e 'Translate: Hello',\n        'expected' =\u003e 'Bonjour',\n    ]),\n]);\n```\n\n#### Read and Manage\n\n```php\n// Get items\n$items = $dataset-\u003egetItems(page: 1, size: 100);\nforeach ($items as $item) {\n    $input = $item-\u003egetInput();\n    $output = $item-\u003egetExpectedOutput();\n}\n\n// Update/delete\n$dataset-\u003eupdate($items);\n$dataset-\u003edelete(['item-id-1', 'item-id-2']);\n$dataset-\u003eclear(); // Delete all\n\n// List/delete datasets\n$datasets = $client-\u003egetDatasets();\n$client-\u003edeleteDataset('dataset-name');\n```\n\n#### JSON Import/Export\n\n```php\n// Import from JSON string\n$json = '[{\"input\": \"question 1\", \"output\": \"answer 1\"}, {\"input\": \"question 2\", \"output\": \"answer 2\"}]';\n$dataset-\u003einsertFromJson($json);\n\n// Import with key mapping (rename keys)\n$json = '[{\"Question\": \"What is PHP?\", \"Expected Answer\": \"A language\"}]';\n$dataset-\u003einsertFromJson($json, keysMapping: [\n    'Question' =\u003e 'input',\n    'Expected Answer' =\u003e 'expected_output',\n]);\n\n// Import while ignoring certain keys\n$dataset-\u003einsertFromJson($json, ignoreKeys: ['internal_id', 'debug_info']);\n\n// Export to JSON string\n$json = $dataset-\u003etoJson();\n\n// Export with key mapping\n$json = $dataset-\u003etoJson(keysMapping: [\n    'input' =\u003e 'Question',\n    'expected_output' =\u003e 'Expected Answer',\n]);\n```\n\n---\n\n### Experiments\n\n```php\nuse Opik\\Experiment\\ExperimentItem;\n\n// Create experiment\n$experiment = $client-\u003ecreateExperiment(\n    name: 'gpt-4-eval',\n    datasetName: 'eval-dataset',\n);\n\n// Log results\n$experiment-\u003elogItems([\n    new ExperimentItem(\n        datasetItemId: 'item-1',\n        traceId: 'trace-1',\n        output: ['result' =\u003e 'Answer'],\n        feedbackScores: [['name' =\u003e 'accuracy', 'value' =\u003e 0.9]],\n    ),\n]);\n\n// Manage experiments\n$experiment = $client-\u003egetExperimentById('experiment-id');\n$client-\u003eupdateExperiment(id: 'experiment-id', name: 'new-name');\n$client-\u003edeleteExperiment('experiment-name');\n```\n\n---\n\n### Prompts\n\nOpik supports two types of prompts: **text prompts** (simple string templates) and **chat prompts** (array of messages following OpenAI's chat format).\n\n#### Text Prompts\n\n```php\n// Create a text prompt\n$prompt = $client-\u003ecreatePrompt(\n    name: 'greeting',\n    template: 'Hello {{name}}, you asked: {{question}}',\n);\n\n// Get and format\n$prompt = $client-\u003egetPrompt('greeting');\n$text = $prompt-\u003eformat(['name' =\u003e 'John', 'question' =\u003e 'How are you?']);\n// Returns: \"Hello John, you asked: How are you?\"\n```\n\n#### Chat Prompts\n\n```php\nuse Opik\\Prompt\\ChatMessage;\n\n// Create a chat prompt with messages array\n$prompt = $client-\u003ecreatePrompt(\n    name: 'assistant-prompt',\n    template: [\n        ChatMessage::system('You are a helpful assistant specializing in {{domain}}.'),\n        ChatMessage::user('{{question}}'),\n    ],\n);\n\n// Format returns array of messages\n$messages = $prompt-\u003eformat(['domain' =\u003e 'physics', 'question' =\u003e 'What is gravity?']);\n// Returns:\n// [\n//     ['role' =\u003e 'system', 'content' =\u003e 'You are a helpful assistant specializing in physics.'],\n//     ['role' =\u003e 'user', 'content' =\u003e 'What is gravity?'],\n// ]\n```\n\n#### ChatMessage Factory Methods\n\n| Method | Description |\n|--------|-------------|\n| `ChatMessage::system($content)` | Create a system message |\n| `ChatMessage::user($content)` | Create a user message |\n| `ChatMessage::assistant($content)` | Create an assistant message |\n| `ChatMessage::tool($content)` | Create a tool message |\n\n#### Prompt Versions\n\n```php\n// Get version history\n$history = $client-\u003egetPromptHistory('greeting');\n\n// Get specific version\n$version = $prompt-\u003egetVersion('commit-hash');\n\n// Check prompt type\nif ($version-\u003eisChat()) {\n    $messages = $version-\u003eformat($variables);\n} else {\n    $text = $version-\u003eformat($variables);\n}\n```\n\n#### Delete Prompts\n\n```php\n$client-\u003edeletePrompts(['prompt-id-1', 'prompt-id-2']);\n```\n\n---\n\n### Attachments\n\nUpload files to traces or spans:\n\n```php\nuse Opik\\Attachment\\AttachmentEntityType;\n\n$attachmentClient = $client-\u003egetAttachmentClient();\n\n// Upload\n$attachmentClient-\u003euploadAttachment(\n    projectName: 'my-project',\n    entityType: AttachmentEntityType::TRACE,\n    entityId: $trace-\u003egetId(),\n    filePath: '/path/to/file.pdf',\n);\n\n// List\n$attachments = $attachmentClient-\u003egetAttachmentList(\n    projectName: 'my-project',\n    entityType: AttachmentEntityType::TRACE,\n    entityId: $trace-\u003egetId(),\n);\n\n// Download\n$content = $attachmentClient-\u003edownloadAttachment(\n    projectName: 'my-project',\n    entityType: AttachmentEntityType::TRACE,\n    entityId: $trace-\u003egetId(),\n    fileName: 'file.pdf',\n    mimeType: 'application/pdf',\n);\n```\n\n---\n\n### Evaluation Metrics\n\nThe SDK provides heuristic metrics for evaluating LLM outputs:\n\n```php\nuse Opik\\Evaluation\\Metrics\\ExactMatch;\nuse Opik\\Evaluation\\Metrics\\Contains;\nuse Opik\\Evaluation\\Metrics\\RegexMatch;\nuse Opik\\Evaluation\\Metrics\\IsJson;\n\n// ExactMatch - checks for exact equality\n$metric = new ExactMatch();\n$result = $metric-\u003escore([\n    'output' =\u003e 'hello world',\n    'expected' =\u003e 'hello world',\n]);\necho $result-\u003evalue; // 1.0 (match) or 0.0 (no match)\n\n// Contains - checks if output contains expected substring\n$metric = new Contains(caseSensitive: false);\n$result = $metric-\u003escore([\n    'output' =\u003e 'Hello World',\n    'expected' =\u003e 'hello',\n]);\necho $result-\u003evalue; // 1.0\n\n// RegexMatch - checks if output matches a regex pattern\n$metric = new RegexMatch();\n$result = $metric-\u003escore([\n    'output' =\u003e 'Contact: test@example.com',\n    'pattern' =\u003e '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}/',\n]);\necho $result-\u003evalue; // 1.0\n\n// IsJson - checks if output is valid JSON\n$metric = new IsJson();\n$result = $metric-\u003escore([\n    'output' =\u003e '{\"key\": \"value\"}',\n]);\necho $result-\u003evalue; // 1.0\n```\n\n#### Available Metrics\n\n| Metric | Description |\n|--------|-------------|\n| `ExactMatch` | Checks if output exactly equals expected (strict comparison) |\n| `Contains` | Checks if output contains expected substring (supports case-insensitive) |\n| `RegexMatch` | Checks if output matches a regex pattern |\n| `IsJson` | Checks if output is valid JSON |\n\n#### Evaluation Function\n\nRun evaluations against datasets with automatic experiment tracking:\n\n```php\nuse Opik\\Evaluation\\Metrics\\ExactMatch;\nuse Opik\\Evaluation\\Metrics\\Contains;\n\n// Get or create a dataset\n$dataset = $client-\u003egetOrCreateDataset('qa-dataset');\n$dataset-\u003einsert([\n    new DatasetItem(data: [\n        'input' =\u003e 'What is PHP?',\n        'expected' =\u003e 'programming language',\n    ]),\n    new DatasetItem(data: [\n        'input' =\u003e 'What is Python?',\n        'expected' =\u003e 'programming language',\n    ]),\n]);\n\n// Define your task function\n$task = function (array $item): array {\n    // Your LLM call or processing logic here\n    $response = $llm-\u003ecomplete($item['input']);\n    return ['output' =\u003e $response];\n};\n\n// Run evaluation\n$result = $client-\u003eevaluate(\n    dataset: $dataset,\n    task: $task,\n    scoringMetrics: [\n        new ExactMatch(),\n        new Contains(),\n    ],\n    experimentName: 'my-evaluation',\n);\n\n// Access results\necho \"Evaluated {$result-\u003ecount()} items in {$result-\u003edurationSeconds}s\\n\";\necho \"Average exact_match: {$result-\u003egetAverageScore('exact_match')}\\n\";\necho \"Average contains: {$result-\u003egetAverageScore('contains')}\\n\";\n\n// Get all average scores\n$averages = $result-\u003egetAverageScores();\nforeach ($averages as $metric =\u003e $score) {\n    echo \"{$metric}: {$score}\\n\";\n}\n```\n\nThe `evaluate()` function:\n- Creates an experiment for tracking results\n- Runs the task function on each dataset item\n- Calculates scores using the provided metrics\n- Logs feedback scores to traces\n- Returns detailed results with averages\n\n---\n\n## API Reference\n\n### OpikClient Methods\n\n| Category | Method | Description |\n|----------|--------|-------------|\n| **Tracing** | `trace(...)` | Create a trace |\n| | `span(...)` | Create a standalone span |\n| | `searchTraces(...)` | Search traces with OQL |\n| | `searchSpans(...)` | Search spans with OQL |\n| | `getTraceContent(id)` | Get trace by ID |\n| | `getSpanContent(id)` | Get span by ID |\n| **Feedback** | `logTracesFeedbackScores(scores)` | Batch log trace scores |\n| | `logSpansFeedbackScores(scores)` | Batch log span scores |\n| | `logThreadsFeedbackScores(scores)` | Batch log thread scores |\n| | `deleteTraceFeedbackScore(id, name)` | Delete trace score |\n| | `deleteSpanFeedbackScore(id, name)` | Delete span score |\n| **Threads** | `closeThread(id)` | Close a thread |\n| | `closeThreads(ids)` | Close multiple threads |\n| **Datasets** | `getDataset(name)` | Get dataset |\n| | `getDatasets()` | List datasets |\n| | `createDataset(name)` | Create dataset |\n| | `getOrCreateDataset(name)` | Get or create dataset |\n| | `deleteDataset(name)` | Delete dataset |\n| **Experiments** | `createExperiment(name, datasetName)` | Create experiment |\n| | `getExperiment(name)` | Get by name |\n| | `getExperimentById(id)` | Get by ID |\n| | `updateExperiment(id, ...)` | Update experiment |\n| | `deleteExperiment(name)` | Delete experiment |\n| **Prompts** | `createPrompt(name, template)` | Create text or chat prompt |\n| | `getPrompt(name)` | Get prompt |\n| | `getPrompts()` | List prompts |\n| | `getPromptHistory(name)` | Get versions |\n| | `deletePrompts(ids)` | Delete prompts |\n| **Attachments** | `getAttachmentClient()` | Get attachment client |\n| **Evaluation** | `evaluate(dataset, task, ...)` | Run evaluation with metrics |\n| **Utilities** | `authCheck()` | Verify credentials |\n| | `flush()` | Send pending data |\n| | `getConfig()` | Get configuration |\n| | `getProjectUrl()` | Get project URL |\n\n### Trace Methods\n\n| Method | Description |\n|--------|-------------|\n| `span(name, type?, ...)` | Create child span |\n| `update(output?, ...)` | Update trace data |\n| `end()` | End the trace |\n| `logFeedbackScore(name, value, ...)` | Log feedback score |\n| `getId()` | Get trace ID |\n\n### Span Methods\n\n| Method | Description |\n|--------|-------------|\n| `span(name, type?, ...)` | Create child span |\n| `update(output?, model?, usage?, ...)` | Update span data |\n| `end()` | End the span |\n| `logFeedbackScore(name, value, ...)` | Log feedback score |\n| `getId()` | Get span ID |\n\n---\n\n## Development\n\n```bash\n# Install dependencies\ncomposer install\n\n# Run tests\ncomposer test\n\n# Run with coverage (requires pcov/xdebug)\ncomposer test:coverage\n\n# Static analysis\ncomposer analyse\n\n# Code formatting\ncomposer format\ncomposer format:check\n```\n\n---\n\n## License\n\nMIT\n\n## Trademarks\n\nOpik and Comet ML are trademarks of Comet ML, Inc. This project is not affiliated with, endorsed by, or sponsored by Comet ML, Inc.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fklipitkas%2Fopik-php","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fklipitkas%2Fopik-php","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fklipitkas%2Fopik-php/lists"}