Projects in Awesome Lists tagged with agent-benchmark
A curated list of projects in awesome lists tagged with agent-benchmark.
https://github.com/hidai25/eval-view
Catch AI agent regressions before you ship. YAML test cases, golden baselines, execution tracing, cost tracking, CI integration. LangGraph, CrewAI, Anthropic, OpenAI.
agent agent-benchmark agent-evaluation agentic-ai ai-agents anthropic crewai crewai-tools evaluation langchain langgraph langgraph-python llm llmops mlops openai-assistants pytest testing tools
Last synced: 09 Mar 2026
https://github.com/grnbtqdbyx-create/trace-to-skill
Check whether a repo is Codex-ready, then turn failed AI coding-agent runs into reusable AGENTS.md rules, skills, and eval gates.
agent-benchmark agent-evals agent-skills agent-workflows agents-md agents-md-linter ai-agents ai-code-review ai-coding-agents claude-code codex codex-cli codex-readiness evals github-action mcp mcp-security open-source-maintainers openai-codex prompt-injection
Last synced: 31 May 2026