https://github.com/fraware/rust-evals

Evaluate existing candidate patches; built to make benchmark claims auditable, reproducible, and explicitly evaluator-conditioned

Host: GitHub
URL: https://github.com/fraware/rust-evals
Owner: fraware
License: other
Created: 2026-04-21T18:09:04.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2026-05-01T13:25:29.000Z (about 2 months ago)
Last Synced: 2026-05-01T15:17:27.809Z (about 2 months ago)
Topics: artifact-evaluation, benchmarking, coding-agents, evaluation, formal-methods, llm, python, reproducibility, rust, swe-bench
Language: Python
Homepage:
Size: 198 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
