Projects in Awesome Lists tagged with reliable-evaluation
A curated list of projects in awesome lists tagged with reliable-evaluation.
https://github.com/iaar-shanghai/xfinder
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
benchmark cc-by-nc-nd-4 chatglm dataset evaluation gpt judge-model key-answer-extraction large-language-models llm llm-as-a-judge llm-as-evaluator lm-evaluation open-compass phi qwen regex reliability reliable-evaluation xfinder
Last synced: 06 Apr 2025