Projects in Awesome Lists tagged with llm-as-a-judge
A curated list of projects in awesome lists tagged with llm-as-a-judge.
https://github.com/agenta-ai/agenta
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
llm-as-a-judge llm-evaluation llm-framework llm-monitoring llm-observability llm-platform llm-playground llm-tools llmops-platform prompt-engineering prompt-management rag-evaluation
Last synced: 12 May 2025
https://github.com/prometheus-eval/prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
evaluation gpt4 litellm llm llm-as-a-judge llm-as-evaluator llmops python vllm
Last synced: 05 Apr 2025
https://github.com/iaar-shanghai/xfinder
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
benchmark cc-by-nc-nd-4 chatglm dataset evaluation gpt judge-model key-answer-extraction large-language-models llm llm-as-a-judge llm-as-evaluator lm-evaluation open-compass phi qwen regex reliability reliable-evaluation xfinder
Last synced: 06 Apr 2025
https://github.com/iaar-shanghai/xverify
xVerify: Efficient Answer Verifier for Large Language Model Evaluations
benchmark cc-by-nc-nd-4 chatgpt deepseek-math evaluation judge-model llm llm-as-a-judge math-verify open-compass open-r1 reasoning-models regex reliability reliability-tools xverify
Last synced: 14 Apr 2025
https://github.com/root-signals/root-signals-mcp
MCP for Root Signals Evaluation Platform
agentic-ai evals llm-as-a-judge mcp model-context-protocol pydantic-ai
Last synced: 03 May 2025