Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/TIGER-AI-Lab/MMLU-Pro
The scripts for MMLU-Pro
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/TIGER-AI-Lab/MMLU-Pro
- Owner: TIGER-AI-Lab
- License: apache-2.0
- Created: 2024-05-16T23:59:51.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-07-30T05:11:35.000Z (about 2 months ago)
- Last Synced: 2024-07-31T08:23:34.259Z (about 2 months ago)
- Language: Python
- Size: 107 MB
- Stars: 54
- Watchers: 1
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
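- awesome-llm-eval - MMLU-Pro - MMLU-Pro is an improved version of the MMLU dataset. MMLU has long been the reference multiple-choice knowledge dataset. However, recent research has shown that it is both noisy (some questions are unanswerable) and too easy (because of growing model capability and increasing contamination). MMLU-Pro presents models with ten choices instead of four, requires reasoning on more questions, and has been reviewed by experts to reduce noise. It is higher in quality and harder than the original. MMLU-Pro reduces the effect of prompt variation on model performance, a common problem with its predecessor MMLU. Research found that models using "Chain of Thought" reasoning perform better on this new benchmark, which suggests that MMLU-Pro is better suited to evaluating the subtle reasoning abilities of AI. (2024-05-20) | (Datasets-or-Benchmark / General)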
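As a rough illustration of the "ten choices instead of four" point above, the sketch below renders a question with lettered options and compares the random-guess baseline for four and ten options. It is not taken from the MMLU-Pro repository: the function names `format_question` and `random_guess_accuracy`, the prompt layout, and the sample question are all assumptions made for this example.

```python
import string


def format_question(question, options):
    """Render a multiple-choice question with lettered options (A, B, C, ...).

    Hypothetical layout for illustration only; the real MMLU-Pro scripts
    may format prompts differently.
    """
    if not 2 <= len(options) <= 26:
        raise ValueError("expected between 2 and 26 options")
    lines = [f"Question: {question}"]
    for letter, option in zip(string.ascii_uppercase, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)


def random_guess_accuracy(num_options):
    """Expected accuracy when picking one option uniformly at random."""
    if num_options < 1:
        raise ValueError("num_options must be positive")
    return 1.0 / num_options


if __name__ == "__main__":
    # Made-up sample question with ten options (labelled A through J).
    sample_options = [f"option {i}" for i in range(1, 11)]
    print(format_question("Which option is correct?", sample_options))

    # Chance level falls from 25% with four options to 10% with ten.
    print(f"4 options:  {random_guess_accuracy(4):.0%}")
    print(f"10 options: {random_guess_accuracy(10):.0%}")
```

With ten options, a model that guesses blindly scores well below the four-option chance level, which is one reason a ten-option benchmark leaves less room for lucky answers.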