https://github.com/statusfailed/catgpt
- Host: GitHub
- URL: https://github.com/statusfailed/catgpt
- Owner: statusfailed
- Created: 2024-03-22T09:34:56.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2024-04-19T11:39:35.000Z (almost 2 years ago)
- Last Synced: 2025-04-14T05:17:47.823Z (11 months ago)
- Language: Python
- Size: 55.7 KB
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# catgpt
A GPT model implemented with [catgrad](https://github.com/statusfailed/catgrad).
NOTE: requires `catgrad 0.2.0`
# Run
Install dependencies:

    python -m venv venv
    source venv/bin/activate
    pip install torch catgrad==0.2.0

Get some pretrained model weights:

    ./get_weights.sh

Generate some text:

    python generate.py

# Architecture
The architecture is a **very** stripped-down [nanoGPT](https://github.com/karpathy/nanoGPT).
Several layers have been removed, which reduces the quality of the generated text.
In order of importance, the removed layers are:
- [ ] Positional encodings (!)
- [ ] self-attention output layer
- [ ] `FeedForward` after attention in each `block`
- [ ] Learnable layer norm weights
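
To make the list above concrete, here is a minimal pure-Python sketch of what a block looks like once those four pieces are gone. It is not the catgrad implementation and does not use the catgrad API. The single attention head, the pre-norm residual connection, the causal mask, and every name here (`layer_norm`, `causal_attention`, `block`, `wq`, `wk`, `wv`) are assumptions borrowed from the general nanoGPT design, used only for illustration.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (no learnable weights)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(xs, wq, wk, wv):
    """Single-head causal self-attention with no output projection (assumed shape)."""
    qs = [matvec(wq, x) for x in xs]
    ks = [matvec(wk, x) for x in xs]
    vs = [matvec(wv, x) for x in xs]
    scale = 1.0 / math.sqrt(len(qs[0]))
    dim = len(vs[0])
    out = []
    for t, q in enumerate(qs):
        # Position t may only attend to positions 0..t.
        scores = [scale * sum(a * b for a, b in zip(q, ks[s])) for s in range(t + 1)]
        weights = softmax(scores)
        # The weighted sum of values is returned directly: no output layer.
        out.append([sum(w * vs[s][i] for s, w in enumerate(weights)) for i in range(dim)])
    return out


def block(xs, wq, wk, wv):
    """x + attention(layer_norm(x)); no FeedForward follows the attention."""
    normed = [layer_norm(x) for x in xs]
    attended = causal_attention(normed, wq, wk, wv)
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, attended)]
```

One consequence of dropping positional encodings shows up directly in this sketch: the output at the last position depends only on which tokens came before it, not on their order, so `block([a, b, c], ...)[2]` and `block([b, a, c], ...)[2]` agree up to floating-point rounding. That is a plausible reason the README ranks positional encodings as the most important missing piece.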