https://github.com/tiger-ai-lab/theoremqa

The official repo for "TheoremQA: A Theorem-driven Question Answering dataset" (EMNLP 2023)

Host: GitHub
URL: https://github.com/tiger-ai-lab/theoremqa
Owner: TIGER-AI-Lab
License: mit
Created: 2024-04-11T02:54:54.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-05-15T13:39:12.000Z (over 1 year ago)
Last Synced: 2025-06-13T07:08:04.317Z (4 months ago)
Topics: lm, math, theorem
Language: Python
Homepage: https://arxiv.org/abs/2305.12524
Size: 1.97 MB
Stars: 31
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# TheoremQA
The official repo for [TheoremQA: A Theorem-driven Question Answering dataset](https://arxiv.org/abs/2305.12524) (EMNLP 2023)

The leaderboard is available at https://huggingface.co/spaces/TIGER-Lab/Science-Leaderboard

## Introduction
We propose the first question-answering dataset driven by STEM theorems. We annotated 800 QA pairs covering 350+ theorems across Math, EE&CS, Physics, and Finance. The dataset was collected by human experts and is of very high quality. We provide it as a new benchmark to test the limits of large language models in applying theorems to solve challenging university-level questions. Below, we provide a pipeline to prompt LLMs and evaluate their outputs with WolframAlpha.

The dataset covers a wide range of topics within these fields.

## Examples

## Huggingface
Our dataset is now available on Hugging Face: https://huggingface.co/datasets/TIGER-Lab/TheoremQA
```
from datasets import load_dataset

# Use the repository ID that matches the dataset URL above.
dataset = load_dataset("TIGER-Lab/TheoremQA")
```
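
Once loaded, each record can be handled like a plain dictionary. The sketch below shows one way to summarize records by answer type. It assumes fields named `Question`, `Answer`, and `Answer_type`; these names are an assumption, so check the dataset card before relying on them. The rows are made-up stand-ins, not real dataset entries, which lets the example run without downloading anything.

```python
from collections import Counter

# Hypothetical rows standing in for dataset records.
# The field names are assumptions; verify them against the dataset card.
sample_rows = [
    {"Question": "What is 2 + 2?", "Answer": "4", "Answer_type": "integer"},
    {"Question": "Is 7 prime?", "Answer": "True", "Answer_type": "bool"},
    {"Question": "What is 1 / 4?", "Answer": "0.25", "Answer_type": "float"},
    {"Question": "What is 3 * 3?", "Answer": "9", "Answer_type": "integer"},
]


def count_answer_types(rows):
    """Count how many rows fall under each answer type."""
    return Counter(row["Answer_type"] for row in rows)


type_counts = count_answer_types(sample_rows)
print(type_counts.most_common())
```

The same function should work on a loaded split, since it only needs an iterable of dictionary-like records.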

## Running Instructions (5-shot ICL)
```
mkdir outputs
python run.py --model [YOUR_MODEL_HF_LINK] --form short
```
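
For readers unfamiliar with 5-shot in-context learning (ICL), the idea is to prepend five worked question/answer demonstrations to the target question. The sketch below illustrates that idea only. The function `build_few_shot_prompt`, the `Question:`/`Answer:` template, and the demonstrations are hypothetical and are not taken from `run.py`, whose actual prompt format may differ.

```python
def build_few_shot_prompt(demos, question, k=5):
    """Join up to k worked demonstrations and the target question into one prompt.

    demos is a list of (question, answer) pairs. The template is an
    illustrative assumption, not the format used by the repository.
    """
    parts = []
    for demo_question, demo_answer in demos[:k]:
        parts.append(f"Question: {demo_question}\nAnswer: {demo_answer}")
    # The target question is left unanswered so the model completes it.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


# Made-up demonstrations; only the first k=5 of these 7 are used.
demos = [(f"What is {i} + {i}?", str(2 * i)) for i in range(1, 8)]
prompt = build_few_shot_prompt(demos, "What is 10 + 10?")
print(prompt)
```

Keeping the demonstrations fixed across all test questions makes runs comparable between models.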

## Cite our Work
```
@inproceedings{chen2023theoremqa,
  title={{TheoremQA}: A Theorem-driven Question Answering Dataset},
  author={Chen, Wenhu and Yin, Ming and Ku, Max and Lu, Pan and Wan, Yixin and Ma, Xueguang and Xu, Jianyu and Wang, Xinyi and Xia, Tony},
  booktitle={The 2023 Conference on Empirical Methods in Natural Language Processing},
  year={2023}
}
```
