https://github.com/auraoneai/evalkit-action

GitHub Action for running EvalKit validation, scoring, and reporting in CI.

Host: GitHub
URL: https://github.com/auraoneai/evalkit-action
Owner: auraoneai
License: MIT
Created: 2026-05-12T01:33:45.000Z (22 days ago)
Default Branch: main
Last Pushed: 2026-05-12T04:38:03.000Z (22 days ago)
Last Synced: 2026-05-12T05:39:02.292Z (22 days ago)
Topics: ai-evaluation, evalkit, evals, github-actions
Language: TypeScript
Homepage: https://auraone.ai/open
Size: 12.7 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md

README

# evalkit-action

Eval-as-CI for AuraOne EvalKit. The action accepts `rubric-path`, `responses-path`, `judge-config`, and `threshold`, installs `auraone-evalkit`, runs score/report commands, and can fail checks below threshold.
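
As a rough illustration of how a JavaScript action might read those four inputs, the sketch below relies on the runner convention of exposing each `with:` value as an `INPUT_<NAME>` environment variable (upper-cased, hyphens kept). The `EvalInputs` shape, the `readInputs` helper, and the choice of which inputs are required are assumptions for illustration, not the action's actual source.

```typescript
// Hypothetical input reader; not taken from the evalkit-action source.
interface EvalInputs {
  rubricPath: string;
  responsesPath: string;
  judgeConfig?: string; // raw JSON text, to be validated separately
  threshold?: string;   // action inputs always arrive as strings
}

type Env = Record<string, string | undefined>;

// The runner maps `with: rubric-path` to the variable INPUT_RUBRIC-PATH:
// spaces become underscores, hyphens are kept, and the name is upper-cased.
function getInput(env: Env, name: string): string {
  const key = `INPUT_${name.replace(/ /g, "_").toUpperCase()}`;
  return (env[key] ?? "").trim();
}

function readInputs(env: Env): EvalInputs {
  const rubricPath = getInput(env, "rubric-path");
  const responsesPath = getInput(env, "responses-path");
  // Assumption: both paths are required; the README does not say so explicitly.
  if (!rubricPath || !responsesPath) {
    throw new Error("rubric-path and responses-path are required");
  }
  return {
    rubricPath,
    responsesPath,
    judgeConfig: getInput(env, "judge-config") || undefined,
    threshold: getInput(env, "threshold") || undefined,
  };
}
```

Inside a running action this would be called as `readInputs(process.env)`. Taking the environment as a parameter keeps the helper easy to test.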

## What This Is Not

Examples contain no paid or customer data.

## Example

```yaml
name: evalkit
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: auraoneai/evalkit-action@v0.1.1
        with:
          rubric-path: evals/rubric.jsonl
          responses-path: evals/model_outputs.jsonl
          threshold: "0.75"
          github-token: ${{ secrets.GITHUB_TOKEN }}
```

The action installs `auraone-evalkit`, writes report-ready score JSON, generates a Markdown report, comments on pull requests when a token and PR context are available, and fails the check when the average score is below `threshold`.
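
The threshold check described here can be sketched roughly as follows. This is a minimal illustration, assuming the scores are available as a plain array of numbers and that `threshold` arrives as a string (as all action inputs do). The `gate` and `parseThreshold` names and the `GateResult` shape are hypothetical, and the real action may handle edge cases differently.

```typescript
// Hypothetical gate; the score shape and function names are illustrative only.
interface GateResult {
  average: number;
  threshold: number;
  passed: boolean;
}

function parseThreshold(raw: string): number {
  const value = Number(raw);
  // Number("") is 0, so an empty string is rejected explicitly.
  if (raw.trim() === "" || !Number.isFinite(value)) {
    throw new Error(`threshold must be a number, got "${raw}"`);
  }
  return value;
}

function gate(scores: number[], rawThreshold: string): GateResult {
  if (scores.length === 0) {
    throw new Error("no scores to average");
  }
  const average = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const threshold = parseThreshold(rawThreshold);
  // The check fails only when the average is strictly below the threshold.
  return { average, threshold, passed: average >= threshold };
}
```

With the example workflow's `threshold: "0.75"`, `gate([0.5, 1.0], "0.75")` passes because the average equals the threshold, while `gate([0.5, 0.75], "0.75")` does not.
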

`judge-config` must be a JSON object. The action validates it, writes it to a temporary file, and exposes it to EvalKit subprocesses as `EVALKIT_JUDGE_CONFIG` and `EVALKIT_JUDGE_CONFIG_PATH`.
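
The validate, write, and expose steps for `judge-config` might look something like the sketch below. Only the two environment variable names come from the README. The `prepareJudgeConfig` helper, the temporary file name, and the assumption that `EVALKIT_JUDGE_CONFIG` holds the JSON text while `EVALKIT_JUDGE_CONFIG_PATH` holds the file path are illustrative guesses.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical helper; not taken from the evalkit-action source.
function prepareJudgeConfig(raw: string): Record<string, string> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    throw new Error("judge-config is not valid JSON");
  }
  // Arrays, null, and primitives are valid JSON but are not JSON objects.
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    throw new Error("judge-config must be a JSON object");
  }
  // Write the normalized JSON into a fresh temporary directory.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "evalkit-"));
  const file = path.join(dir, "judge-config.json");
  const text = JSON.stringify(parsed);
  fs.writeFileSync(file, text, "utf8");
  // Assumption: the first variable carries the JSON text, the second the path.
  return { EVALKIT_JUDGE_CONFIG: text, EVALKIT_JUDGE_CONFIG_PATH: file };
}
```

The returned map could then be merged into the environment of a child process, for example `{ ...process.env, ...prepareJudgeConfig(raw) }`, so that only the EvalKit subprocess sees the values.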
