{"id":28653899,"url":"https://github.com/tiger-ai-lab/theoremqa","last_synced_at":"2025-09-06T00:44:00.035Z","repository":{"id":232715194,"uuid":"785019599","full_name":"TIGER-AI-Lab/TheoremQA","owner":"TIGER-AI-Lab","description":"The official repo for \"TheoremQA: A Theorem-driven Question Answering dataset\" (EMNLP 2023)","archived":false,"fork":false,"pushed_at":"2024-05-15T13:39:12.000Z","size":2063,"stargazers_count":31,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-13T07:08:04.317Z","etag":null,"topics":["lm","math","theorem"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2305.12524","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TIGER-AI-Lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-11T02:54:54.000Z","updated_at":"2025-06-09T01:34:27.000Z","dependencies_parsed_at":"2024-04-24T20:43:04.748Z","dependency_job_id":"3157b292-75c9-4738-acd0-1a5bb69d4e67","html_url":"https://github.com/TIGER-AI-Lab/TheoremQA","commit_stats":null,"previous_names":["tiger-ai-lab/theoremqa"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TIGER-AI-Lab/TheoremQA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FTheoremQA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FTheoremQA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FTheoremQA/releases","manifests_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FTheoremQA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TIGER-AI-Lab","download_url":"https://codeload.github.com/TIGER-AI-Lab/TheoremQA/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FTheoremQA/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273842866,"owners_count":25177921,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lm","math","theorem"],"created_at":"2025-06-13T07:08:03.485Z","updated_at":"2025-09-06T00:43:59.957Z","avatar_url":"https://github.com/TIGER-AI-Lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TheoremQA\nThe official repo for [TheoremQA: A Theorem-driven Question Answering dataset](https://arxiv.org/abs/2305.12524) (EMNLP 2023)\n\nThe leaderboard is available at https://huggingface.co/spaces/TIGER-Lab/Science-Leaderboard\n\n## Introduction\nWe propose the first question-answering dataset driven by STEM theorems. We annotated 800 QA pairs covering 350+ theorems across Math, EE\u0026CS, Physics and Finance. The dataset was collected by human experts and is of very high quality. 
We provide the dataset as a new benchmark to test the limits of large language models in applying theorems to solve challenging university-level questions. We also provide a pipeline, described below, to prompt LLMs and evaluate their outputs with WolframAlpha.\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"overview.001.jpeg\" width=\"1000\"\u003e\n\u003c/p\u003e\n\nThe dataset covers a wide range of topics, listed below:\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"fields.png\" width=\"700\"\u003e\n\u003c/p\u003e\n\n## Examples\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"examples.001.jpeg\" width=\"400\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"examples.002.jpeg\" width=\"400\"\u003e\n\u003c/p\u003e\n\n## Huggingface\nOur dataset is now available on Hugging Face: https://huggingface.co/datasets/TIGER-Lab/TheoremQA\n```\n# Load TheoremQA from the Hugging Face Hub (requires the `datasets` package)\nfrom datasets import load_dataset\ndataset = load_dataset(\"TIGER-Lab/TheoremQA\")\n```\n\n## Running Instructions (5-shot ICL)\n```\nmkdir outputs\npython run.py --model [YOUR_MODEL_HF_LINK] --form short\n```\n\n\n## Cite our Work\n```\n@inproceedings{chen2023theoremqa,\n  title={Theoremqa: A theorem-driven question answering dataset},\n  author={Chen, Wenhu and Yin, Ming and Ku, Max and Lu, Pan and Wan, Yixin and Ma, Xueguang and Xu, Jianyu and Wang, Xinyi and Xia, Tony},\n  booktitle={The 2023 Conference on Empirical Methods in Natural Language Processing},\n  year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftiger-ai-lab%2Ftheoremqa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftiger-ai-lab%2Ftheoremqa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftiger-ai-lab%2Ftheoremqa/lists"}