https://github.com/auraoneai/evalkit-action
GitHub Action for running EvalKit validation, scoring, and reporting in CI.
ai-evaluation evalkit evals github-actions
- Host: GitHub
- URL: https://github.com/auraoneai/evalkit-action
- Owner: auraoneai
- License: MIT
- Created: 2026-05-12T01:33:45.000Z (22 days ago)
- Default Branch: main
- Last Pushed: 2026-05-12T04:38:03.000Z (22 days ago)
- Last Synced: 2026-05-12T05:39:02.292Z (22 days ago)
- Topics: ai-evaluation, evalkit, evals, github-actions
- Language: TypeScript
- Homepage: https://auraone.ai/open
- Size: 12.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
README
# evalkit-action
Eval-as-CI for AuraOne EvalKit. The action accepts the inputs `rubric-path`, `responses-path`, `judge-config`, and `threshold`, installs `auraone-evalkit`, runs its score and report commands, and can fail the check when the score falls below `threshold`.
## What This Is Not
Examples contain no paid or customer data.
## Example
```yaml
name: evalkit
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: auraoneai/evalkit-action@v0.1.1
        with:
          rubric-path: evals/rubric.jsonl
          responses-path: evals/model_outputs.jsonl
          threshold: "0.75"
          github-token: ${{ secrets.GITHUB_TOKEN }}
```
The action installs `auraone-evalkit`, writes report-ready score JSON, generates a Markdown report, comments on pull requests when a token and PR context are available, and fails the check when the average score is below `threshold`.
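The threshold check described above can be pictured with a minimal sketch. This is not the action's actual source: the `ScoredResponse` shape and the function names `averageScore` and `meetsThreshold` are hypothetical, and the sketch assumes a plain arithmetic mean and that a score exactly equal to `threshold` passes.

```typescript
// Hypothetical shape of one scored row; the real EvalKit score JSON may differ.
interface ScoredResponse {
  id: string;
  score: number;
}

// Arithmetic mean of all scores. An empty input is treated as an error
// rather than silently passing or failing the gate.
function averageScore(rows: ScoredResponse[]): number {
  if (rows.length === 0) {
    throw new Error("no scores to average");
  }
  const total = rows.reduce((sum, row) => sum + row.score, 0);
  return total / rows.length;
}

// Action inputs arrive as strings (for example "0.75"), so the threshold
// is parsed and validated before the comparison.
function meetsThreshold(rows: ScoredResponse[], thresholdInput: string): boolean {
  const threshold = Number(thresholdInput);
  if (thresholdInput.trim() === "" || !Number.isFinite(threshold)) {
    throw new Error(`threshold must be a number, got "${thresholdInput}"`);
  }
  // Assumption: the check fails only when the average is strictly below threshold.
  return averageScore(rows) >= threshold;
}
```

In this sketch, a caller would mark the check as failed whenever `meetsThreshold` returns `false`.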
`judge-config` must be a JSON object. The action validates it, writes it to a temporary file, and exposes it to EvalKit subprocesses as `EVALKIT_JUDGE_CONFIG` and `EVALKIT_JUDGE_CONFIG_PATH`.
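The validate, write, and expose steps for `judge-config` might look roughly like the sketch below. It is an illustration, not the action's implementation: the function name `prepareJudgeConfig`, the temporary file name, and the assumption that `EVALKIT_JUDGE_CONFIG` carries the JSON text while `EVALKIT_JUDGE_CONFIG_PATH` carries the file path are all guesses based on the variable names.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: validates the raw input, writes it to a temporary
// file, and returns the environment variables to pass to subprocesses.
function prepareJudgeConfig(raw: string): Record<string, string> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("judge-config is not valid JSON");
  }
  // Arrays and null are valid JSON but are not JSON objects.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("judge-config must be a JSON object");
  }

  // mkdtempSync creates a fresh, uniquely named directory under the OS temp dir.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "evalkit-"));
  const configPath = path.join(dir, "judge-config.json");
  const serialized = JSON.stringify(parsed);
  fs.writeFileSync(configPath, serialized, "utf8");

  // Assumption: one variable holds the JSON text, the other the file path.
  return {
    EVALKIT_JUDGE_CONFIG: serialized,
    EVALKIT_JUDGE_CONFIG_PATH: configPath,
  };
}
```

A caller could merge the returned object into the environment of each EvalKit subprocess, for example by spreading it over `process.env` when spawning the command.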