Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jquesnelle/sat-reading
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/jquesnelle/sat-reading
- Owner: jquesnelle
- Created: 2023-02-14T04:16:06.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-02-20T03:58:02.000Z (over 1 year ago)
- Last Synced: 2024-06-15T11:33:53.090Z (5 months ago)
- Language: Python
- Size: 289 KB
- Stars: 20
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-ChatGPT-repositories - sat-reading - new blog: language models vs. the sat reading test! they score ~90%, and flan-t5 does as well as gpt-3.5! finetuning even better! all the deets: available here, including a new huggingface dataset with questions (+models): (NLP)
README
This repository contains the code used to produce the models and data in the blog post [Language Models vs. The SAT Reading Test](https://jeffq.com/blog/language-models-vs-the-sat-reading-test).
Dataset: [emozilla/sat-reading](https://huggingface.co/datasets/emozilla/sat-reading)
Models: [XXL (11B)](https://huggingface.co/emozilla/flan-t5-xxl-sat-reading) [XL (3B)](https://huggingface.co/emozilla/flan-t5-xl-sat-reading) [Large (780M)](https://huggingface.co/emozilla/flan-t5-large-sat-reading) [Base (350M)](https://huggingface.co/emozilla/flan-t5-base-sat-reading)
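For quick experimentation, both the dataset and the finetuned checkpoints load with the standard Hugging Face libraries. A minimal sketch is below; the prompt template the models were finetuned on is produced by this repository's scripts, so the prompt used here is only a placeholder.

```python
# Minimal sketch: load the published dataset and one finetuned checkpoint.
# The example prompt is a placeholder; the actual prompt format used for
# finetuning is produced by the scripts in this repository.
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

dataset = load_dataset("emozilla/sat-reading")
print(dataset)  # inspect the available splits and columns

model_name = "emozilla/flan-t5-large-sat-reading"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = "Read the passage and answer the question: ..."  # placeholder
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The scripts that produce the dataset and models end to end are summarized in the following table.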
| File | Description |
| ---- | ----------- |
| [combine-raw-data.py](combine-raw-data.py) | Combine data in `raw-data` folder into a single JSON |
| [create-dataset.py](create-dataset.py) | Create [datasets](https://github.com/huggingface/datasets)-compatible datasets from combined JSON |
| [process-dataset-for-training.py](process-dataset-for-training.py) | Create a tokenized version of an existing dataset for training |
| [prompt-loop.py](prompt-loop.py) | Playground for loading and prompting models |
| [take-tests.py](take-tests.py) | Evaluate models against a dataset |
| [train.py](train.py) | Finetune a FLAN-T5 model |

To check the generalization of finetuned models, install [lm-evaluation-harness](https://github.com/bigscience-workshop/lm-evaluation-harness) and run it on the `SuperGLUE` metrics: `cb`, `copa`, `superglue_rte`, `wic`, and `wsc` (and any other metrics you'd like, of course).
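As a rough sketch of what the training pipeline above does, the following finetunes a FLAN-T5 checkpoint on the published dataset with the Hugging Face `Seq2SeqTrainer`. This is not the repository's `train.py`: the split name, the `prompt`/`answer` column names, and the hyperparameters are placeholders, so adjust them to match the actual dataset schema and the settings described in the blog post.

```python
# Generic seq2seq finetuning sketch with Hugging Face Transformers; the real
# prompt formatting and hyperparameters live in train.py and
# process-dataset-for-training.py.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_model = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

dataset = load_dataset("emozilla/sat-reading")

def tokenize(batch):
    # "prompt" and "answer" are placeholder column names; the real schema is
    # defined by create-dataset.py / process-dataset-for-training.py.
    inputs = tokenizer(batch["prompt"], truncation=True, max_length=1024)
    labels = tokenizer(text_target=batch["answer"], truncation=True, max_length=16)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(
    tokenize, batched=True, remove_columns=dataset["train"].column_names
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="flan-t5-sat-reading",
        per_device_train_batch_size=4,
        learning_rate=1e-4,
        num_train_epochs=2,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```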
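For the SuperGLUE check, the harness is usually driven from its own command line; if the installed fork also exposes the upstream `simple_evaluate` Python entry point, an invocation might look roughly like the sketch below. The model type string, argument names, and task identifiers are assumptions that vary between harness versions and forks, so treat this as hypothetical and consult the harness README for the real interface.

```python
# Hypothetical sketch: score a finetuned checkpoint on the SuperGLUE tasks via
# the harness's Python entry point. The "hf-seq2seq" model type, argument
# names, and task identifiers are assumptions -- check your installed version.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-seq2seq",
    model_args="pretrained=emozilla/flan-t5-large-sat-reading",
    tasks=["cb", "copa", "superglue_rte", "wic", "wsc"],
    batch_size=8,
)
print(results["results"])
```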