https://github.com/surus-lat/benchy
A benchmarking engine for evaluating AI systems on task-specific performance.
- Host: GitHub
- URL: https://github.com/surus-lat/benchy
- Owner: surus-lat
- Created: 2025-09-12T18:04:39.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2026-04-02T22:46:20.000Z (2 months ago)
- Last Synced: 2026-04-03T06:56:45.748Z (2 months ago)
- Topics: ai, benchmarks, engine, evals
- Language: Python
- Homepage: https://benchy.lat/
- Size: 3.19 MB
- Stars: 4
- Watchers: 0
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Agents: AGENTS.md