Projects in Awesome Lists tagged with agent-benchmark

A curated list of projects in awesome lists tagged with agent-benchmark.

https://github.com/hidai25/eval-view

Catch AI agent regressions before you ship. YAML test cases, golden baselines, execution tracing, cost tracking, CI integration. LangGraph, CrewAI, Anthropic, OpenAI.

agent agent-benchmark agent-evaluation agentic-ai ai-agents anthropic crewai crewai-tools evaluation langchain langgraph langgraph-python llm llmops mlops openai-assistants pytest testing tools

Last synced: 09 Mar 2026

https://github.com/grnbtqdbyx-create/trace-to-skill

Check whether a repo is Codex-ready, then turn failed AI coding-agent runs into reusable AGENTS.md rules, skills, and eval gates.

agent-benchmark agent-evals agent-skills agent-workflows agents-md agents-md-linter ai-agents ai-code-review ai-coding-agents claude-code codex codex-cli codex-readiness evals github-action mcp mcp-security open-source-maintainers openai-codex prompt-injection

Last synced: 31 May 2026
