Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jaymody/picogpt
An unnecessarily tiny implementation of GPT-2 in NumPy.
- Host: GitHub
- URL: https://github.com/jaymody/picogpt
- Owner: jaymody
- License: MIT
- Created: 2023-01-21T21:07:13.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-24T20:05:53.000Z (over 1 year ago)
- Last Synced: 2024-10-15T13:00:42.633Z (24 days ago)
- Topics: deep-learning, gpt, gpt-2, large-language-models, machine-learning, neural-network, nlp, python
- Language: Python
- Homepage:
- Size: 13.7 KB
- Stars: 3,213
- Watchers: 28
- Forks: 414
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ChatGPT-repositories - picoGPT - An unnecessarily tiny implementation of GPT-2 in NumPy. (Reimplementations)
README
# PicoGPT
Accompanying blog post: [GPT in 60 Lines of Numpy](https://jaykmody.com/blog/gpt-from-scratch/)

---
You've seen [openai/gpt-2](https://github.com/openai/gpt-2).
You've seen [karpathy/minGPT](https://github.com/karpathy/mingpt).
You've even seen [karpathy/nanoGPT](https://github.com/karpathy/nanogpt)!
But have you seen [picoGPT](https://github.com/jaymody/picoGPT)??!?
`picoGPT` is an unnecessarily tiny and minimal implementation of [GPT-2](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) in plain [NumPy](https://numpy.org). The entire forward pass code is [40 lines of code](https://github.com/jaymody/picoGPT/blob/main/gpt2_pico.py#L3-L41).
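For a sense of what a NumPy-only forward pass rests on, here is a generic sketch of the standard GPT-2 building blocks (illustrative only; names and exact code are not copied from the repo):

```python
import numpy as np

# Illustrative building blocks of a NumPy-only GPT-2 forward pass.
# This mirrors standard GPT-2 math; it is a sketch, not picoGPT's actual code.

def gelu(x):
    # GPT-2 uses the tanh approximation of the GELU activation
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x):
    exp = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
    return exp / exp.sum(axis=-1, keepdims=True)

def layer_norm(x, g, b, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return g * (x - mean) / np.sqrt(var + eps) + b

def attention(q, k, v, mask):
    # causal scaled dot-product attention: [seq, d] in, [seq, d] out
    return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v
```

The full forward pass stacks these pieces into multi-head attention and MLP blocks on top of learned token and position embeddings.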
picoGPT features:
* Fast? ❌ Nah, picoGPT is megaSLOW 🐌
* Training code? ❌ Error, 4️⃣0️⃣4️⃣ not found
* Batch inference? ❌ picoGPT is civilized, single file line, one at a time only
* top-p sampling? ❌ top-k? ❌ temperature? ❌ categorical sampling?! ❌ greedy? ✅
* Readable? `gpt2.py` ✅ `gpt2_pico.py` ❌
* Smol??? ✅✅✅✅✅✅ YESS!!! TEENIE TINY in fact 🤏

A quick breakdown of each of the files:
* `encoder.py` contains the code for OpenAI's BPE Tokenizer, taken straight from their [gpt-2 repo](https://github.com/openai/gpt-2/blob/master/src/encoder.py).
* `utils.py` contains the code to download and load the GPT-2 model weights, tokenizer, and hyper-parameters.
* `gpt2.py` contains the actual GPT model and generation code, which we can run as a Python script.
* `gpt2_pico.py` is the same as `gpt2.py`, but in even fewer lines of code. Why? Because why not 😎👍.

#### Dependencies
```bash
pip install -r requirements.txt
```
Tested on `Python 3.9.10`.

#### Usage
```bash
python gpt2.py "Alan Turing theorized that computers would one day become"
```

Which generates
```
the most powerful machines on the planet.

The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
```

You can also control the number of tokens to generate, the model size (one of `["124M", "355M", "774M", "1558M"]`), and the directory to save the models:
```bash
python gpt2.py \
"Alan Turing theorized that computers would one day become" \
--n_tokens_to_generate 40 \
--model_size "124M" \
--models_dir "models"
```
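Since greedy decoding is the only sampling strategy on offer, the generation loop itself is tiny. A minimal sketch of the idea, assuming a hypothetical `model` function that maps a list of token ids to per-position logits (a stand-in for the forward pass, not the repo's exact API):

```python
import numpy as np

def generate_greedy(model, input_ids, n_tokens_to_generate):
    # `model` is a hypothetical stand-in: list of token ids -> [seq, vocab] logits
    for _ in range(n_tokens_to_generate):
        logits = model(input_ids)             # run the full forward pass
        next_id = int(np.argmax(logits[-1]))  # greedy: take the most likely next token
        input_ids = input_ids + [next_id]     # append and repeat (no KV cache)
    return input_ids[-n_tokens_to_generate:]  # only the newly generated ids
```

Note that every step re-runs the whole forward pass over the growing sequence, which is part of why picoGPT is megaSLOW.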