https://github.com/wellecks/naturalproofs
NaturalProofs: Mathematical Theorem Proving in Natural Language (NeurIPS 2021 Datasets & Benchmarks)
- Host: GitHub
- URL: https://github.com/wellecks/naturalproofs
- Owner: wellecks
- License: mit
- Created: 2021-03-10T23:14:50.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-09-08T09:23:55.000Z (over 2 years ago)
- Last Synced: 2024-08-01T16:39:41.360Z (5 months ago)
- Language: Python
- Homepage:
- Size: 4.88 MB
- Stars: 112
- Watchers: 7
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
## NaturalProofs: Mathematical Theorem Proving in Natural Language
[NaturalProofs: Mathematical Theorem Proving in Natural Language](https://cs.nyu.edu/~welleck/welleck2021naturalproofs.pdf)\
Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4632538.svg)](https://doi.org/10.5281/zenodo.4632538)

This repo contains:
- The **NaturalProofs Dataset**
- **Tokenized task data** for mathematical reference retrieval and generation.
- **Preprocessing** NaturalProofs and the task data.
- **Training** and **evaluation** for mathematical reference retrieval and generation.
- **Pretrained models** for mathematical reference retrieval and generation.

Please cite our work if you found the resources in this repository useful:
```
@inproceedings{welleck2021naturalproofs,
title={NaturalProofs: Mathematical Theorem Proving in Natural Language},
author={Sean Welleck and Jiacheng Liu and Ronan Le Bras and Hannaneh Hajishirzi and Yejin Choi and Kyunghyun Cho},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
year={2021},
url={https://openreview.net/forum?id=Jvxa8adr3iY}
}
```

## Quick download
To download and unpack NaturalProofs, use:
```
pip install gdown
python download.py --naturalproofs --savedir /path/to/savedir
```

To download and unpack all files that we describe below, use:
```
python download.py --naturalproofs --tokenized --checkpoint --other --savedir /path/to/savedir
```
This creates the following file structure:
```
{savedir}/data # contains NaturalProofs base data (.json files) and tokenized task data (.pkl files)
{savedir}/ckpt # contains pretrained model checkpoints
{savedir}/other # contains precomputed files for evaluation (ref encodings, etc.)
```

## NaturalProofs Dataset
We provide the NaturalProofs Dataset (JSON per domain):

| NaturalProofs Dataset [[zenodo](https://doi.org/10.5281/zenodo.4632538)]| Domain|
|-|-|
|[naturalproofs_proofwiki.json](https://zenodo.org/record/4902289/files/naturalproofs_proofwiki.json?download=1)|ProofWiki|
|[naturalproofs_stacks.json](https://zenodo.org/record/4902289/files/naturalproofs_stacks.json?download=1)|Stacks|
|[naturalproofs_trench.json](https://zenodo.org/record/4902202/files/naturalproofs_trench.json?download=1)|Real Analysis textbook|
|[naturalproofs_stein.json](https://zenodo.org/record/4902289/files/naturalproofs_stein.py?download=1) (script)|Number Theory textbook|

To download NaturalProofs, use:
```
python download.py --naturalproofs --savedir /path/to/savedir
```

#### Combined ProofWiki+Stacks
The download includes an extra combined ProofWiki+Stacks file made with [notebooks/merge.ipynb](notebooks/merge.ipynb).

#### Preprocessing
To see the steps used to create each domain of NaturalProofs from raw data, see the following notebooks.\
This preprocessing is **not needed** if you are using a preprocessed dataset provided above.
| Domain| |
|-|-|
|ProofWiki|[notebook](notebooks/parse_proofwiki.ipynb)|
|Stacks|[notebook](notebooks/parse_stacks.ipynb)|
|Real Analysis textbook|[notebook](notebooks/parse_textbooks.ipynb)|
|Number Theory textbook|[notebook](notebooks/parse_textbooks.ipynb)|

## Mathematical Reference Retrieval and Generation
To use NaturalProofs for the reference retrieval and generation tasks described in the paper, the first step is tokenization.

### Tokenized dataset
We tokenize the raw NaturalProofs Dataset into two different formats:
- **Pairwise**: `(x, r)`
  - `x` theorem (sequence of tokens)
  - `r` reference (sequence of tokens)
  - This version is used to train and evaluate the **pairwise model**.
- **Sequence**: `(x, [rid_1, ..., rid_Tx])`
  - `x` theorem (sequence of tokens)
  - `rid_i` reference id
  - This version is used to train and evaluate the **autoregressive** and **joint** models.
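
As a rough illustration of the two formats, the sketch below builds pairwise examples and one sequence example from a toy theorem. The whitespace tokenizer, the toy vocabulary, the example texts, and the `make_pairwise` / `make_sequence` helpers are all hypothetical stand-ins; the provided data is produced with the `bert-base-cased` tokenizer and the repository's own scripts.

```python
# Toy illustration of the pairwise and sequence formats.
# Tokenizer, vocabulary, texts, and helper names are hypothetical.

def tokenize(text, vocab):
    """Map whitespace-separated words to integer ids, growing the vocab."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]

def make_pairwise(theorem, references, ref_ids, vocab):
    """Build one (x, r) pair per reference id attached to the theorem."""
    x = tokenize(theorem, vocab)
    return [(x, tokenize(references[rid], vocab)) for rid in ref_ids]

def make_sequence(theorem, ref_ids, vocab):
    """Build one (x, [rid_1, ..., rid_Tx]) example: tokens plus reference ids."""
    return (tokenize(theorem, vocab), list(ref_ids))

vocab = {}
references = {0: "definition of even integer", 1: "sum of integers is an integer"}
theorem = "the sum of two even integers is even"
ref_ids = [0, 1]  # ids of the references attached to this theorem

pairwise = make_pairwise(theorem, references, ref_ids, vocab)
sequence = make_sequence(theorem, ref_ids, vocab)
print(len(pairwise), "pairwise examples; sequence target:", sequence[1])
```

The pairwise format repeats the theorem tokens once per reference, while the sequence format keeps a single example whose target is the list of reference ids.
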
We provide the following versions used in the paper (`bert-base-cased` tokenizer):
| Type | Domain| Splits|
|-|-|-|
|Pairwise, `bert-base-cased`|ProofWiki | train,valid,test |
||Stacks| train,valid,test |
||Real Analysis (textbook)| test |
||Number Theory (textbook)| test |
|Sequence, `bert-base-cased`|ProofWiki | train,valid,test |
||Stacks| train,valid,test |
||Real Analysis (textbook)| test |
||Number Theory (textbook)| test |

To download and unpack them, use:
```
python download.py --tokenized --savedir /path/to/savedir
```
Or use this [Google Drive link](https://drive.google.com/file/d/1OCIvcCyKTyRJeV7QiHdtQQhPJ6QknMpV/view?usp=sharing).

### Pretrained Models
We provide the following models used in the paper:
| Type | |Domain|
|-|-|-|
|**Pairwise**|`bert-base-cased`|ProofWiki|
|**Pairwise**|`bert-base-cased`|Stacks|
|**Pairwise**|`bert-base-cased`|ProofWiki+Stacks|
|**Joint**|`bert-base-cased`|ProofWiki|
|**Joint**|`bert-base-cased`|Stacks|
|**Joint**|`bert-base-cased`|ProofWiki+Stacks|
|**Autoregressive**|`bert-base-cased`|ProofWiki|
|**Autoregressive**|`bert-base-cased`|Stacks|

To download and unpack them, use:
```
python download.py --checkpoint --savedir /path/to/savedir
```
Or use this [Google Drive link](https://drive.google.com/file/d/1uIBeI7fw5vJBhDOl2WL3SbXWmzHgfK3W/view?usp=sharing).

### Creating your own tokenized dataset
This step is **not needed** if you are using a tokenized dataset provided above.\
First, set up the code:
```bash
python setup.py develop
```

To create your own tokenized versions:
- **Pairwise**: `python naturalproofs/tokenize_pairwise.py`
- **Sequence**: `python naturalproofs/encoder_decoder/utils.py`

## Evaluation
We will show you how to run evaluation on the pretrained model checkpoints & associated files.
### Setup
We will assume the file structure given by using the download script.
```bash
python download.py --naturalproofs --tokenized --checkpoint --other --savedir /path/to/savedir
```

We provide a script which assembles an evaluation command for `(model type, domain, task)` combinations.\
We show example commands below.

```bash
python run_analysis.py \
--train-ds-names {proofwiki stacks}+ \ # one or more training domains to choose a model
--eval-ds-names {proofwiki stacks stein trench}+ \ # one or more evaluation domains
--model {pairwise, joint, autoregressive} \
--generation \ # for generation task (autoregressive or joint models only)
--split {valid, test} \
--gpu \
--codedir /path/to/naturalproofs_code \
--datadir /data \
--ckptdir /ckpt \
--outdir /output
```

To make sure your filepaths line up, please look inside `run_analysis.py` to see how the `--{}dir` arguments are used.
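
Before running an evaluation, it can help to confirm that the download produced the expected layout. The sketch below only checks for the `data`, `ckpt`, and `other` subdirectories listed in the Quick download section. The `check_layout` helper is hypothetical and is not part of the repository, and it does not replace reading `run_analysis.py`.

```python
from pathlib import Path

# Subdirectories created by download.py, as listed under "Quick download".
EXPECTED_SUBDIRS = ("data", "ckpt", "other")

def check_layout(savedir):
    """Return the names of expected subdirectories missing under savedir."""
    root = Path(savedir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]

# Hypothetical usage: point this at the --savedir used for the download.
missing = check_layout("/path/to/savedir")
print("missing subdirectories:", missing or "none")
```
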
#### Example: pairwise retrieval
```
python run_analysis.py --train-ds-names proofwiki \
--eval-ds-names proofwiki stein trench \
--model pairwise \
--gpu 1 \
--split test
```

#### Example: joint retrieval
```
python run_analysis.py --train-ds-names proofwiki \
--eval-ds-names proofwiki \
--model joint \
--gpu 1 \
--split test
```

#### Example: joint retrieval OOD
For OOD evaluation on `stein` and `trench` textbooks, provide
reference embeddings from the pairwise model.\
These are the `__encs.pt` files from running the pairwise retrieval evaluation (we provide an example in `other/`).
```
python run_analysis.py --model joint \
--train-ds-names proofwiki \
--eval-ds-names stein trench \
--stein-rencs /other/pairwise__train_proofwiki__eval_stein__test__encs.pt \
--trench-rencs /other/pairwise__train_proofwiki__eval_trench__test__encs.pt \
--gpu 1 \
--split test
```

#### Example: joint retrieval proofwiki+stacks model
To align the model's combined output space with the individual dataset used for evaluation, give a `tok2tok.pkl` map (we provide an example in `other/`):
```
python run_analysis.py --model joint \
--train-ds-names both \
--eval-ds-names proofwiki stacks \
--modeltok2datatok /other/tok2tok.pkl \
--gpu 1 \
--split test
```
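
One way to picture the role of such a map: the combined model ranks references in a joint ProofWiki+Stacks index space, and the map translates those indices into the index space of the single dataset being evaluated. The sketch below assumes the map is a plain dictionary from model index to dataset index. The actual contents of `tok2tok.pkl` are not described in this README, so the dictionary structure, the example ids, and the `remap_ranked` helper are all hypothetical.

```python
def remap_ranked(ranked_model_ids, model2data):
    """Translate ranked model ids to dataset ids, dropping unmapped ids."""
    return [model2data[i] for i in ranked_model_ids if i in model2data]

# Hypothetical map: only model ids 0, 2, and 5 exist in the evaluation dataset.
model2data = {0: 10, 2: 11, 5: 12}
ranked = [5, 1, 0, 3, 2]  # a made-up ranking over the combined space

print(remap_ranked(ranked, model2data))  # [12, 10, 11]
```
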
Note that OOD evaluation (`stein` or `trench`) is not implemented for the combined model.

#### Example: autoregressive retrieval
Without the `--generation` flag, the script adjusts its settings for retrieval evaluation:
```
python run_analysis.py --model autoregressive \
--train-ds-names proofwiki \
--eval-ds-names proofwiki \
--gpu 1 \
--split valid
```

Note that OOD evaluation (`stein` or `trench`) is not implemented for the autoregressive model.
#### Example: autoregressive generation
```
python run_analysis.py --model autoregressive --generation \
--train-ds-names proofwiki \
--eval-ds-names proofwiki \
--gpu 1 \
--split valid
```

## Training
The provided code supports:
- Training a **pairwise** model
- Training an **autoregressive** or **joint** model, initialized with pairwise model components (parameters, reference embeddings)

#### Training a pairwise model
```bash
python naturalproofs/model.py --expr-name pairwise \
--datapath /path/to/_tokenized__bert-base-cased.pkl \
--default-root-dir /path/to/output
```

#### Training a joint model
The joint (and autoregressive) model uses a pairwise checkpoint and reference encodings for initialization.

- The pairwise checkpoint is saved during pairwise training.
- The reference encodings are saved in an `encs.pt` file during pairwise evaluation.

```bash
# --encs-file: obtained from running evaluation on the trained pairwise model
# --set-mode 1: discard duplicates
python naturalproofs/encoder_decoder/model.py \
--datapath /path/to/sequence__tokenized__bert-base-cased.pkl \
--default-root-dir /path/to/output \
--pretrained-retrieval-checkpoint /path/to/pairwise__.ckpt \
--encs-file /path/to/train___eval___valid__encs.pt \
--parallel 1 \
--set-mode 1
```

Our implementation uses the same encoder-decoder architecture for the autoregressive and joint models,
considering the joint model as a one-step special case (with KL-div loss).
See the Appendix for a discussion on this design decision and technical details.

#### Training an autoregressive model
```bash
# --encs-file: obtained from running evaluation on the trained pairwise model
# --set-mode 0: keep duplicates
python naturalproofs/encoder_decoder/model.py \
--datapath /path/to/sequence__tokenized__bert-base-cased.pkl \
--default-root-dir /path/to/output \
--pretrained-retrieval-checkpoint /path/to/pairwise__.ckpt \
--encs-file /path/to/train___eval___valid__encs.pt \
--parallel 0 \
--set-mode 0
```

### Non-neural baselines
TF-IDF example:
```bash
python naturalproofs/baselines.py \
--method tfidf \
--datapath /path/to/_tokenized__bert-base-cased_200.pkl \
--datapath-base /path/to/naturalproofs_.json \
--savedir /path/to/out/

# ==> /path/to/out/tfidf__eval.pkl
```
Then use `analyze.py` to compute metrics:
```bash
python naturalproofs/analyze.py \
--method tfidf \
--eval-path /path/to/out/tfidf__eval.pkl \
--datapath-base /path/to/naturalproofs_.json

# ==> /path/to/out/tfidf__analysis.pkl
```
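
For intuition about what a TF-IDF baseline does, here is a minimal, self-contained sketch that ranks references for a theorem by cosine similarity of TF-IDF vectors. It is not the implementation in `naturalproofs/baselines.py`. The toy texts, the whitespace tokenization, the smoothed IDF formula, and the helper names are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the idf table."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf (an assumption; other variants are equally valid).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_references(theorem, references):
    """Rank reference indices by TF-IDF cosine similarity to the theorem."""
    ref_vecs, idf = tfidf_vectors(references)
    counts = Counter(theorem.lower().split())
    # Terms unseen in the references carry no weight in the query vector.
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scores = [cosine(query, vec) for vec in ref_vecs]
    return sorted(range(len(references)), key=lambda i: scores[i], reverse=True)

references = [
    "definition of a prime number",
    "every integer greater than one has a prime factor",
    "definition of a continuous function",
]
ranking = rank_references("there are infinitely many prime numbers", references)
print(ranking)
```

The returned list orders reference indices from most to least similar; a retrieval metric can then be computed by comparing that ordering against the references actually used in the proof.
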