Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jquesnelle/sat-reading
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/jquesnelle/sat-reading
- Owner: jquesnelle
- Created: 2023-02-14T04:16:06.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-02-20T03:58:02.000Z (over 1 year ago)
- Last Synced: 2024-06-15T11:33:53.090Z (5 months ago)
- Language: Python
- Size: 289 KB
- Stars: 20
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-ChatGPT-repositories - sat-reading - new blog: language models vs. the sat reading test! they score ~90%, and flan-t5 does as well as gpt-3.5! finetuning even better! all the deets: available here, including a new huggingface dataset with questions (+models): (NLP)
README
This repository contains the code used to produce the models and data in the blog post [Language Models vs. The SAT Reading Test](https://jeffq.com/blog/language-models-vs-the-sat-reading-test).
Dataset: [emozilla/sat-reading](https://huggingface.co/datasets/emozilla/sat-reading)
Models: [XXL (11B)](https://huggingface.co/emozilla/flan-t5-xxl-sat-reading) [XL (3B)](https://huggingface.co/emozilla/flan-t5-xl-sat-reading) [Large (780M)](https://huggingface.co/emozilla/flan-t5-large-sat-reading) [Base (350M)](https://huggingface.co/emozilla/flan-t5-base-sat-reading)
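For quick experimentation, both the dataset and the finetuned checkpoints load with the standard Hugging Face libraries. A minimal sketch is below; the prompt template the models were finetuned on is produced by this repository's scripts, so the prompt used here is only a placeholder.

```python
# Minimal sketch: load the published dataset and one finetuned checkpoint.
# The example prompt is a placeholder; the actual prompt format used for
# finetuning is produced by the scripts in this repository.
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

dataset = load_dataset("emozilla/sat-reading")
print(dataset)  # inspect the available splits and columns

model_name = "emozilla/flan-t5-large-sat-reading"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = "Read the passage and answer the question: ..."  # placeholder
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The scripts that produce the dataset and models end to end are summarized in the following table.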
| File | Description |
| ---- | ----------- |
| [combine-raw-data.py](combine-raw-data.py) | Combine data in `raw-data` folder into a single JSON |
| [create-dataset.py](create-dataset.py) | Create [datasets](https://github.com/huggingface/datasets)-compatible datasets from combined JSON |
| [process-dataset-for-training.py](process-dataset-for-training.py) | Create a tokenized version of an existing dataset for training |
| [prompt-loop.py](prompt-loop.py) | Playground for loading and prompting models |
| [take-tests.py](take-tests.py) | Evaluate models against a dataset |
| [train.py](train.py) | Finetune a FLAN-T5 model |

To check the generalization of finetuned models, install [lm-evaluation-harness](https://github.com/bigscience-workshop/lm-evaluation-harness) and run it on the `SuperGLUE` metrics: `cb`, `copa`, `superglue_rte`, `wic`, and `wsc` (and any other metrics you'd like, of course).
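As a rough sketch of what the training pipeline above does, the following finetunes a FLAN-T5 checkpoint on the published dataset with the Hugging Face `Seq2SeqTrainer`. This is not the repository's `train.py`: the split name, the `prompt`/`answer` column names, and the hyperparameters are placeholders, so adjust them to match the actual dataset schema and the settings described in the blog post.

```python
# Generic seq2seq finetuning sketch with Hugging Face Transformers; the real
# prompt formatting and hyperparameters live in train.py and
# process-dataset-for-training.py.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_model = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

dataset = load_dataset("emozilla/sat-reading")

def tokenize(batch):
    # "prompt" and "answer" are placeholder column names; the real schema is
    # defined by create-dataset.py / process-dataset-for-training.py.
    inputs = tokenizer(batch["prompt"], truncation=True, max_length=1024)
    labels = tokenizer(text_target=batch["answer"], truncation=True, max_length=16)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(
    tokenize, batched=True, remove_columns=dataset["train"].column_names
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="flan-t5-sat-reading",
        per_device_train_batch_size=4,
        learning_rate=1e-4,
        num_train_epochs=2,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```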
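For the SuperGLUE check, the harness is usually driven from its own command line; if the installed fork also exposes the upstream `simple_evaluate` Python entry point, an invocation might look roughly like the sketch below. The model type string, argument names, and task identifiers are assumptions that vary between harness versions and forks, so treat this as hypothetical and consult the harness README for the real interface.

```python
# Hypothetical sketch: score a finetuned checkpoint on the SuperGLUE tasks via
# the harness's Python entry point. The "hf-seq2seq" model type, argument
# names, and task identifiers are assumptions -- check your installed version.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-seq2seq",
    model_args="pretrained=emozilla/flan-t5-large-sat-reading",
    tasks=["cb", "copa", "superglue_rte", "wic", "wsc"],
    batch_size=8,
)
print(results["results"])
```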