https://github.com/TIGER-AI-Lab/MMLU-Pro
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
- Host: GitHub
- URL: https://github.com/TIGER-AI-Lab/MMLU-Pro
- Owner: TIGER-AI-Lab
- License: apache-2.0
- Created: 2024-05-16T23:59:51.000Z
- Default Branch: main
- Last Pushed: 2025-02-28T17:39:35.000Z
- Last Synced: 2025-02-28T21:57:39.707Z
- Topics: evaluation, llm
- Language: Python
- Size: 325 MB
- Stars: 193
- Watchers: 6
- Forks: 33
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-llm-eval