An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with evaluation-tools

A curated list of projects in awesome lists tagged with evaluation-tools.

https://github.com/jetbrains/teamcity-ai-agent-testing-demo

End-to-end TeamCity framework for running AI agents on SWE-Bench Lite. It spins up an isolated Docker image per task, extracts patches, scores them with the official harness, and aggregates success rates. The demo uses Junie and Google Gemini CLI as examples.

agent-evaluation agentic-ai ai eval evaluation evaluation-framework evaluation-tools

Last synced: 18 Apr 2026
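
The final step in the description above, aggregating success rates, can be illustrated with a minimal Python sketch. This is not code from the repository: the function name `aggregate_success_rates`, the `(agent, task_id, resolved)` tuple layout, and the sample data are all assumptions made for illustration, and the agent names are placeholders rather than real results.

```python
from collections import defaultdict


def aggregate_success_rates(results):
    """Compute the per-agent fraction of resolved tasks.

    `results` is an iterable of (agent, task_id, resolved) tuples.
    This layout is hypothetical; the real project may store outcomes differently.
    """
    totals = defaultdict(int)
    resolved = defaultdict(int)
    for agent, _task_id, ok in results:
        totals[agent] += 1
        if ok:
            resolved[agent] += 1
    # Every agent in `totals` has at least one task, so division is safe.
    return {agent: resolved[agent] / totals[agent] for agent in totals}


# Illustrative, made-up outcomes (not measured results).
sample = [
    ("agent-a", "task-1", True),
    ("agent-a", "task-2", False),
    ("agent-b", "task-1", True),
    ("agent-b", "task-2", True),
]
rates = aggregate_success_rates(sample)
print(rates)
```

Keeping one record per (agent, task) pair makes it straightforward to recompute rates after rerunning only a subset of tasks.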