{"id":48976084,"url":"https://github.com/synapsekit/evalci","last_synced_at":"2026-04-18T09:04:45.961Z","repository":{"id":350298223,"uuid":"1206217565","full_name":"SynapseKit/evalci","owner":"SynapseKit","description":"LLM quality gates for every PR — run @eval_case suites automatically and block merge if quality drops below threshold","archived":false,"fork":false,"pushed_at":"2026-04-09T18:50:24.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-09T19:38:56.937Z","etag":null,"topics":["ai","ci","eval","github-actions","llm","llmops","machine-learning","quality-assurance","synapsekit","testing"],"latest_commit_sha":null,"homepage":"https://synapsekit.github.io/synapsekit-docs/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SynapseKit.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["SynapseKit"]}},"created_at":"2026-04-09T17:37:26.000Z","updated_at":"2026-04-09T18:50:33.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/SynapseKit/evalci","commit_stats":null,"previous_names":["synapsekit/evalci"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/SynapseKit/evalci","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SynapseKit%2Fevalci","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SynapseKit%2Fevalci/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SynapseKit%2Fevalci/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SynapseKit%2Fevalci/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SynapseKit","download_url":"https://codeload.github.com/SynapseKit/evalci/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SynapseKit%2Fevalci/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31962892,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ci","eval","github-actions","llm","llmops","machine-learning","quality-assurance","synapsekit","testing"],"created_at":"2026-04-18T09:04:44.967Z","updated_at":"2026-04-18T09:04:45.953Z","avatar_url":"https://github.com/SynapseKit.png","language":"Python","funding_links":["https://github.com/sponsors/SynapseKit"],"categories":[],"sub_categories":[],"readme":"# EvalCI by SynapseKit\n\n[![GitHub Marketplace](https://img.shields.io/badge/Marketplace-EvalCI-blue?logo=github)](https://github.com/marketplace/actions/evalci-by-synapsekit)\n[![License](https://img.shields.io/badge/License-Apache_2.0-orange.svg)](https://github.com/SynapseKit/evalci/blob/main/LICENSE)\n[![Latest Release](https://img.shields.io/github/v/release/SynapseKit/evalci?label=release\u0026color=orange)](https://github.com/SynapseKit/evalci/releases/latest)\n[![GitHub Stars](https://img.shields.io/github/stars/SynapseKit/evalci?style=flat\u0026color=orange)](https://github.com/SynapseKit/evalci/stargazers)\n[![Docs](https://img.shields.io/badge/docs-synapsekit-orange)](https://synapsekit.github.io/synapsekit-docs/docs/evalci/overview)\n[![Discussions](https://img.shields.io/github/discussions/SynapseKit/evalci)](https://github.com/SynapseKit/evalci/discussions)\n[![Issues](https://img.shields.io/github/issues/SynapseKit/evalci)](https://github.com/SynapseKit/evalci/issues)\n\n**LLM quality gates for every PR.** Run your `@eval_case` suites automatically and block merge if quality drops below threshold.\n\n- Zero infrastructure — runs entirely in GitHub Actions\n- 2-minute setup\n- Works with any LLM provider (OpenAI, Anthropic, Gemini, and [30+ more](https://synapsekit.github.io/synapsekit-docs/))\n- Posts a formatted results table as a PR comment\n- Sets Action outputs for downstream steps\n\n---\n\n## Quickstart\n\nAdd `.github/workflows/eval.yml` to your repo:\n\n```yaml\nname: EvalCI\n\non:\n  pull_request:\n\njobs:\n  eval:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n      - uses: SynapseKit/evalci@v1\n        with:\n          path: tests/evals\n          threshold: \"0.80\"\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n```\n\nThat's it. EvalCI will:\n1. Install `synapsekit` into the runner\n2. Discover and run all `@eval_case`-decorated functions under `tests/evals/`\n3. Post a results table as a PR comment\n4. Fail the check if any case scores below threshold\n\n---\n\n## Example eval file\n\n```python\n# tests/evals/test_rag.py\nfrom synapsekit.testing import eval_case\n\n@eval_case(min_score=0.80, max_cost_usd=0.01, max_latency_ms=3000)\ndef test_rag_relevancy(eval_context):\n    result = my_rag_pipeline(\"What is SynapseKit?\")\n    return eval_context.score_relevancy(result, reference=\"SynapseKit is a Python library...\")\n\n@eval_case(min_score=0.75)\ndef test_rag_faithfulness(eval_context):\n    result = my_rag_pipeline(\"How do I install SynapseKit?\")\n    return eval_context.score_faithfulness(result, context=retrieved_docs)\n```\n\n---\n\n## PR Comment\n\nEvalCI posts a comment like this on every PR:\n\n\u003e ## EvalCI Results\n\u003e\n\u003e | | Test | Score | Cost | Latency |\n\u003e |---|---|---|---|---|\n\u003e | ✅ | test_rag_relevancy | 0.850 | $0.0050 | 1200ms |\n\u003e | ❌ | test_rag_faithfulness | 0.650 | $0.0120 | 2500ms |\n\u003e\n\u003e **1/2 passed** · Threshold: `0.80` · [SynapseKit EvalCI](https://synapsekit.github.io/synapsekit-docs/)\n\n---\n\n## Inputs\n\n| Input | Description | Default |\n|---|---|---|\n| `path` | Path to eval files or directory | `.` |\n| `threshold` | Global minimum score (0.0–1.0) | `0.7` |\n| `extras` | pip extras for synapsekit (e.g. `openai,anthropic`) | `openai` |\n| `synapsekit-version` | synapsekit version to install, or `latest` | `latest` |\n| `github-token` | Token for posting PR comments | `${{ github.token }}` |\n| `fail-on-regression` | Fail if score regresses vs. baseline | `false` |\n| `token` | EvalCI backend API token _(future)_ | — |\n\n## Outputs\n\n| Output | Description |\n|---|---|\n| `passed` | Number of eval cases that passed |\n| `failed` | Number of eval cases that failed |\n| `total` | Total number of eval cases run |\n| `mean-score` | Mean score across all eval cases |\n\n---\n\n## Using outputs in downstream steps\n\n```yaml\n- uses: SynapseKit/evalci@v1\n  id: eval\n  with:\n    path: tests/evals\n- run: |\n    echo \"Passed: ${{ steps.eval.outputs.passed }}/${{ steps.eval.outputs.total }}\"\n    echo \"Mean score: ${{ steps.eval.outputs.mean-score }}\"\n```\n\n---\n\n## Multiple providers\n\n```yaml\n- uses: SynapseKit/evalci@v1\n  with:\n    extras: \"openai,anthropic\"\n    threshold: \"0.75\"\n  env:\n    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n```\n\n---\n\n## Badge\n\n```markdown\n[![EvalCI](https://github.com/{owner}/{repo}/actions/workflows/eval.yml/badge.svg)](https://github.com/{owner}/{repo}/actions/workflows/eval.yml)\n```\n\n---\n\n## Documentation\n\nFull documentation is available at **[synapsekit.github.io/synapsekit-docs/docs/evalci/overview](https://synapsekit.github.io/synapsekit-docs/docs/evalci/overview)**\n\n| | |\n|---|---|\n| [Overview](https://synapsekit.github.io/synapsekit-docs/docs/evalci/overview) | What EvalCI is and how it works |\n| [Quickstart](https://synapsekit.github.io/synapsekit-docs/docs/evalci/quickstart) | Set up in 5 minutes |\n| [Writing eval cases](https://synapsekit.github.io/synapsekit-docs/docs/evalci/writing-evals) | How to write `@eval_case` functions |\n| [Action reference](https://synapsekit.github.io/synapsekit-docs/docs/evalci/action-reference) | All inputs, outputs, and configuration |\n| [Examples](https://synapsekit.github.io/synapsekit-docs/docs/evalci/examples) | RAG, agents, multi-provider workflows |\n\n## About\n\nEvalCI is built on [SynapseKit](https://synapsekit.github.io/synapsekit-docs/) — a Python library for building LLM applications with 30+ provider integrations and a built-in evaluation framework.\n\n- [Documentation](https://synapsekit.github.io/synapsekit-docs/docs/evalci/overview)\n- [SynapseKit](https://github.com/SynapseKit/SynapseKit)\n- [Issues](https://github.com/SynapseKit/evalci/issues)\n- [Discussions](https://github.com/SynapseKit/evalci/discussions)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynapsekit%2Fevalci","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsynapsekit%2Fevalci","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynapsekit%2Fevalci/lists"}