NaturalProofs: Mathematical Theorem Proving in Natural Language (NeurIPS 2021 Datasets & Benchmarks)
https://github.com/wellecks/naturalproofs
- Host: GitHub
- URL: https://github.com/wellecks/naturalproofs
- Owner: wellecks
- License: mit
- Created: 2021-03-10T23:14:50.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2022-09-08T09:23:55.000Z (about 2 years ago)
- Last Synced: 2024-08-01T16:39:41.360Z (3 months ago)
- Language: Python
- Homepage:
- Size: 4.88 MB
- Stars: 112
- Watchers: 7
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
## NaturalProofs: Mathematical Theorem Proving in Natural Language
[NaturalProofs: Mathematical Theorem Proving in Natural Language](https://cs.nyu.edu/~welleck/welleck2021naturalproofs.pdf)\
Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4632538.svg)](https://doi.org/10.5281/zenodo.4632538)
This repo contains:
- The **NaturalProofs Dataset**
- **Tokenized task data** for mathematical reference retrieval and generation.
- **Preprocessing** NaturalProofs and the task data.
- **Training** and **evaluation** for mathematical reference retrieval and generation.
- **Pretrained models** for mathematical reference retrieval and generation.

Please cite our work if you find the resources in this repository useful:
```
@inproceedings{welleck2021naturalproofs,
title={NaturalProofs: Mathematical Theorem Proving in Natural Language},
author={Sean Welleck and Jiacheng Liu and Ronan Le Bras and Hannaneh Hajishirzi and Yejin Choi and Kyunghyun Cho},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
year={2021},
url={https://openreview.net/forum?id=Jvxa8adr3iY}
}
```

## Quick download
To download and unpack NaturalProofs, use:
```
pip install gdown
python download.py --naturalproofs --savedir /path/to/savedir
```

To download and unpack all files that we describe below, use:
```
python download.py --naturalproofs --tokenized --checkpoint --other --savedir /path/to/savedir
```
This creates the following file structure:
```
{savedir}/data # contains NaturalProofs base data (.json files) and tokenized task data (.pkl files)
{savedir}/ckpt # contains pretrained model checkpoints
{savedir}/other # contains precomputed files for evaluation (ref encodings, etc.)
```

## NaturalProofs Dataset
We provide the NaturalProofs Dataset (one JSON file per domain):

| NaturalProofs Dataset [[zenodo](https://doi.org/10.5281/zenodo.4632538)] | Domain |
|-|-|
|[naturalproofs_proofwiki.json](https://zenodo.org/record/4902289/files/naturalproofs_proofwiki.json?download=1)|ProofWiki|
|[naturalproofs_stacks.json](https://zenodo.org/record/4902289/files/naturalproofs_stacks.json?download=1)|Stacks|
|[naturalproofs_trench.json](https://zenodo.org/record/4902202/files/naturalproofs_trench.json?download=1)|Real Analysis textbook|
|[naturalproofs_stein.json](https://zenodo.org/record/4902289/files/naturalproofs_stein.py?download=1) (script)|Number Theory textbook|

To download NaturalProofs, use:
```
python download.py --naturalproofs --savedir /path/to/savedir
```

#### Combined ProofWiki+Stacks
The download includes an extra combined ProofWiki+Stacks file, made with [notebooks/merge.ipynb](notebooks/merge.ipynb).

#### Preprocessing
To see the steps used to create each domain of NaturalProofs from raw data, see the following notebooks.\
This preprocessing is **not needed** if you are using a preprocessed dataset provided above.

| Domain | Preprocessing notebook |
|-|-|
|ProofWiki|[notebook](notebooks/parse_proofwiki.ipynb)|
|Stacks|[notebook](notebooks/parse_stacks.ipynb)|
|Real Analysis textbook|[notebook](notebooks/parse_textbooks.ipynb)|
|Number Theory textbook|[notebook](notebooks/parse_textbooks.ipynb)|

## Mathematical Reference Retrieval and Generation
To use NaturalProofs for the reference retrieval and generation tasks described in the paper, the first step is tokenization.

### Tokenized dataset
We tokenize the raw NaturalProofs Dataset into two different formats:
- **Pairwise**: `(x, r)`
- `x` theorem (sequence of tokens)
- `r` reference (sequence of tokens)
- This version is used to train and evaluate the **pairwise model**.
- **Sequence**: `(x, [rid_1, ..., rid_Tx])`
- `x` theorem (sequence of tokens)
- `rid_i` reference id
- This version is used to train and evaluate the **autoregressive** and **joint** models.
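As a toy illustration of the two formats (a whitespace split stands in for the `bert-base-cased` tokenizer, and the theorem, references, and ids below are made up):

```python
# Toy sketch of the two task formats; a whitespace split stands in
# for the bert-base-cased tokenizer, and all data below is made up.
def tokenize(text):
    return text.split()

theorem = "Every subgroup of a cyclic group is cyclic"
references = {0: "Definition of cyclic group", 1: "Definition of subgroup"}

# Pairwise: one (x, r) pair of token sequences per (theorem, reference).
pairwise = [(tokenize(theorem), tokenize(r)) for r in references.values()]

# Sequence: theorem tokens paired with the ids of its references.
sequence = (tokenize(theorem), sorted(references))

print(len(pairwise), sequence[1])  # 2 [0, 1]
```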
We provide the following versions used in the paper (`bert-base-cased` tokenizer):
| Type | Domain| Splits|
|-|-|-|
|Pairwise, `bert-base-cased`|ProofWiki| train,valid,test |
||Stacks| train,valid,test |
||Real Analysis (textbook)| test |
||Number Theory (textbook)| test |
|Sequence, `bert-base-cased`|ProofWiki| train,valid,test |
||Stacks| train,valid,test |
||Real Analysis (textbook)| test |
||Number Theory (textbook)| test |

To download and unpack them, use:
```
python download.py --tokenized --savedir /path/to/savedir
```
Or use the [Google Drive link](https://drive.google.com/file/d/1OCIvcCyKTyRJeV7QiHdtQQhPJ6QknMpV/view?usp=sharing).

### Pretrained Models
We provide the following models used in the paper:

| Type | | Domain |
|-|-|-|
|**Pairwise**|`bert-base-cased`|ProofWiki|
|**Pairwise**|`bert-base-cased`|Stacks|
|**Pairwise**|`bert-base-cased`|ProofWiki+Stacks|
|**Joint**|`bert-base-cased`|ProofWiki|
|**Joint**|`bert-base-cased`|Stacks|
|**Joint**|`bert-base-cased`|ProofWiki+Stacks|
|**Autoregressive**|`bert-base-cased`|ProofWiki|
|**Autoregressive**|`bert-base-cased`|Stacks|

To download and unpack them, use:
```
python download.py --checkpoint --savedir /path/to/savedir
```
Or use the [Google Drive link](https://drive.google.com/file/d/1uIBeI7fw5vJBhDOl2WL3SbXWmzHgfK3W/view?usp=sharing).

### Creating your own tokenized dataset
This step is **not needed** if you are using a tokenized dataset provided above.\
First, setup the code:
```bash
python setup.py develop
```

To create your own tokenized versions:
- **Pairwise**: `python naturalproofs/tokenize_pairwise.py`
- **Sequence**: `python naturalproofs/encoder_decoder/utils.py`

## Evaluation
We show how to run evaluation using the pretrained model checkpoints and their associated files.
### Setup
We assume the file structure produced by the download script.
```bash
python download.py --naturalproofs --tokenized --checkpoint --other --savedir /path/to/savedir
```

We provide a script which assembles an evaluation command for `(model type, domain, task)` combinations.\
We show example commands below.

```bash
# --train-ds-names: one or more training domains, used to choose a model
# --eval-ds-names: one or more evaluation domains
# --generation: run the generation task (autoregressive or joint models only)
python run_analysis.py \
    --train-ds-names {proofwiki stacks}+ \
    --eval-ds-names {proofwiki stacks stein trench}+ \
    --model {pairwise, joint, autoregressive} \
    --generation \
    --split {valid, test} \
    --gpu \
    --codedir /path/to/naturalproofs_code \
    --datadir /data \
    --ckptdir /ckpt \
    --outdir /output
```

To make sure your filepaths line up, please look inside `run_analysis.py` to see how the `--{}dir` arguments are used.
#### Example: pairwise retrieval
```
python run_analysis.py --train-ds-names proofwiki \
--eval-ds-names proofwiki stein trench \
--model pairwise \
--gpu 1 \
--split test
```

#### Example: joint retrieval
```
python run_analysis.py --train-ds-names proofwiki \
--eval-ds-names proofwiki \
--model joint \
--gpu 1 \
--split test
```

#### Example: joint retrieval OOD
For OOD evaluation on `stein` and `trench` textbooks, provide
reference embeddings from the pairwise model.\
These are the `__encs.pt` files from running the pairwise retrieval evaluation (we provide an example in `other/`).
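The mechanics can be sketched with random vectors (the dimensions, dot-product scoring, and data below are illustrative assumptions, not the repo's exact code): precomputed reference encodings are scored against a theorem encoding, and references are ranked by score.

```python
import numpy as np

rng = np.random.default_rng(0)
num_refs, dim = 5, 8

# Stand-ins for precomputed reference encodings (the __encs.pt files)
# and a theorem encoding produced by the pairwise model.
ref_encs = rng.standard_normal((num_refs, dim))
theorem_enc = rng.standard_normal(dim)

# Score each reference by dot product and rank best-first.
scores = ref_encs @ theorem_enc
ranking = np.argsort(-scores)
print(ranking)  # reference indices, highest-scoring first
```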
```
python run_analysis.py --model joint \
--train-ds-names proofwiki \
--eval-ds-names stein trench \
--stein-rencs /other/pairwise__train_proofwiki__eval_stein__test__encs.pt \
--trench-rencs /other/pairwise__train_proofwiki__eval_trench__test__encs.pt \
--gpu 1 \
--split test
```

#### Example: joint retrieval, ProofWiki+Stacks model
To align the model's combined output space with the individual dataset used for evaluation, give a `tok2tok.pkl` map (we provide an example in `other/`):
```
python run_analysis.py --model joint \
--train-ds-names both \
--eval-ds-names proofwiki stacks \
--modeltok2datatok /other/tok2tok.pkl \
--gpu 1 \
--split test
```
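Conceptually, such a map translates reference ids from the combined model's output space into a single dataset's id space; a made-up example (the real mapping lives in `tok2tok.pkl`):

```python
# Hypothetical map from combined-vocabulary reference ids to the ids of a
# single evaluation dataset; ids with no counterpart map to None.
tok2tok = {0: 0, 1: None, 2: 1, 3: 2}

model_ranking = [3, 1, 0, 2]  # ids ranked in the model's combined space

# Translate and drop ids absent from the evaluation dataset.
data_ranking = [tok2tok[i] for i in model_ranking if tok2tok[i] is not None]
print(data_ranking)  # [2, 0, 1]
```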
Note that OOD evaluation (`stein` or `trench`) is not implemented for the combined model.

#### Example: autoregressive retrieval
Without the `--generation` flag, the script adjusts its settings for retrieval evaluation:
```
python run_analysis.py --model autoregressive \
--train-ds-names proofwiki \
--eval-ds-names proofwiki \
--gpu 1 \
--split valid
```

Note that OOD evaluation (`stein` or `trench`) is not implemented for the autoregressive model.
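For intuition, autoregressive retrieval produces a ranked list by decoding reference ids one step at a time; a toy greedy loop with a stand-in scoring function (hypothetical names, not the repo's model):

```python
def next_ref_scores(history, num_refs):
    # Stand-in for the model: favors the smallest id not yet produced.
    return [(-i if i not in history else float("-inf")) for i in range(num_refs)]

def greedy_decode(num_refs, max_steps):
    # Greedily pick the highest-scoring next reference id at each step.
    history = []
    for _ in range(max_steps):
        scores = next_ref_scores(history, num_refs)
        best = max(range(num_refs), key=lambda i: scores[i])
        history.append(best)
    return history

print(greedy_decode(num_refs=4, max_steps=3))  # [0, 1, 2]
```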
#### Example: autoregressive generation
```
python run_analysis.py --model autoregressive --generation \
--train-ds-names proofwiki \
--eval-ds-names proofwiki \
--gpu 1 \
--split valid
```

## Training
The provided code supports:
- Training a **pairwise** model
- Training an **autoregressive** or **joint** model, initialized with pairwise model components (parameters, reference embeddings)

#### Training a pairwise model
```bash
python naturalproofs/model.py --expr-name pairwise \
--datapath /path/to/_tokenized__bert-base-cased.pkl \
--default-root-dir /path/to/output
```

#### Training a joint model
The joint (and autoregressive) model uses a pairwise checkpoint and reference encodings for initialization:

- The pairwise checkpoint is saved during pairwise training.
- The reference encodings are saved in an `encs.pt` file during pairwise evaluation.

```bash
# --encs-file is obtained by running evaluation on the trained pairwise model
python naturalproofs/encoder_decoder/model.py \
    --datapath /path/to/sequence__tokenized__bert-base-cased.pkl \
    --default-root-dir /path/to/output \
    --pretrained-retrieval-checkpoint /path/to/pairwise__.ckpt \
    --encs-file /path/to/train___eval___valid__encs.pt \
    --parallel 1 \
    --set-mode 1  # discard duplicates
```

Our implementation uses the same encoder-decoder architecture for the autoregressive and joint models, treating the joint model as a one-step special case (with a KL-divergence loss).
See the Appendix for a discussion of this design decision and technical details.

#### Training an autoregressive model
```bash
# --encs-file is obtained by running evaluation on the trained pairwise model
python naturalproofs/encoder_decoder/model.py \
    --datapath /path/to/sequence__tokenized__bert-base-cased.pkl \
    --default-root-dir /path/to/output \
    --pretrained-retrieval-checkpoint /path/to/pairwise__.ckpt \
    --encs-file /path/to/train___eval___valid__encs.pt \
    --parallel 0 \
    --set-mode 0  # keep duplicates
```

### Non-neural baselines
TF-IDF example:
```bash
python naturalproofs/baselines.py \
--method tfidf \
--datapath /path/to/_tokenized__bert-base-cased_200.pkl \
--datapath-base /path/to/naturalproofs_.json \
    --savedir /path/to/out/
# ==> writes /path/to/out/tfidf__eval.pkl
```
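For intuition, a from-scratch sketch of what TF-IDF ranking does (the data is made up, and this is not the repo's implementation, which operates on the tokenized dataset):

```python
import math
from collections import Counter

# Made-up references and query.
refs = ["cyclic group", "subgroup of a group", "real analysis limit"]
query = "subgroup of a cyclic group"

docs = [r.split() for r in refs]
df = Counter(t for d in docs for t in set(d))  # document frequencies
N = len(docs)

def tfidf(tokens):
    # Term frequency weighted by inverse document frequency.
    tf = Counter(tokens)
    return {t: tf[t] * math.log(N / df[t]) for t in tokens if t in df}

def score(query, doc):
    # Dot product of the query and document TF-IDF vectors.
    q, d = tfidf(query.split()), tfidf(doc)
    return sum(w * d.get(t, 0.0) for t, w in q.items())

# Rank references against the query, best first.
ranking = sorted(range(N), key=lambda i: -score(query, docs[i]))
print(ranking)  # [1, 0, 2]
```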
Then use `analyze.py` to compute metrics:
```bash
python naturalproofs/analyze.py \
--method tfidf \
--eval-path /path/to/out/tfidf__eval.pkl \
    --datapath-base /path/to/naturalproofs_.json
# ==> writes /path/to/out/tfidf__analysis.pkl
```
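For reference, retrieval analyses of this kind typically report metrics such as mean average precision; a minimal sketch on toy data (the exact metrics computed by `analyze.py` are not shown here):

```python
def average_precision(ranking, relevant):
    """AP of one ranked list given the set of gold reference ids."""
    hits, precisions = 0, []
    for k, rid in enumerate(ranking, start=1):
        if rid in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

# Two toy theorems with their ranked predictions and gold references.
examples = [
    ([3, 1, 2], {3, 2}),  # hits at ranks 1 and 3
    ([0, 4, 5], {5}),     # hit at rank 3
]
mAP = sum(average_precision(r, g) for r, g in examples) / len(examples)
print(round(mAP, 4))  # 0.5833
```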