Projects in Awesome Lists tagged with evaluation-as-code
A curated list of projects in awesome lists tagged with evaluation-as-code.
https://github.com/lizhiyao/oh-my-knowledge
Evaluation framework for LLM knowledge inputs: prompts, RAG corpora, skills, and agent workflows. It holds the model fixed and varies the artifact under test. Built-in statistical rigor: bootstrap confidence intervals, Krippendorff's α, length debiasing, and saturation curves.
agent-evaluation ai benchmark bootstrap-ci claude claude-code evaluation-as-code evaluation-framework knowledge-engineering krippendorff-alpha llm llm-evaluation llm-judge multi-judge-ensemble prompt-engineering prompt-testing rag-evaluation skill-evaluation
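As a rough illustration of the "fix the model, vary the artifact" idea with a bootstrap confidence interval, here is a minimal sketch in plain Python. It is not taken from oh-my-knowledge and does not use its API. The function name `bootstrap_ci`, its parameters, and the sample scores are all hypothetical. It assumes both artifacts were scored on the same items by the same model, and it uses a simple paired percentile bootstrap on the per-item score differences.

```python
import random
from statistics import mean


def bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Paired percentile bootstrap CI for mean(scores_b) - mean(scores_a).

    Hypothetical helper: scores_a and scores_b hold per-item scores for two
    artifacts (e.g. two prompts) evaluated with the same model on the same items.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty, equal-length score lists")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)

    # Resample the per-item differences with replacement and record each mean.
    resampled_means = sorted(
        mean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )

    # Read the alpha/2 and 1 - alpha/2 percentiles off the sorted means.
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return mean(diffs), resampled_means[lower_idx], resampled_means[upper_idx]


if __name__ == "__main__":
    # Made-up scores for illustration only.
    prompt_v1 = [0.61, 0.55, 0.70, 0.48, 0.66, 0.59, 0.73, 0.52]
    prompt_v2 = [0.68, 0.57, 0.74, 0.55, 0.65, 0.66, 0.78, 0.60]
    point, low, high = bootstrap_ci(prompt_v1, prompt_v2)
    print(f"mean difference = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

If the interval excludes zero, the score difference between the two artifacts is unlikely to be resampling noise alone. How the listed project actually computes its intervals may differ.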
Last synced: 15 Jun 2026