https://github.com/raj-pulapakura/gpt-from-scratch
My implementation of Andrej Karpathy's "Neural Networks: Zero to Hero" series
- Host: GitHub
- URL: https://github.com/raj-pulapakura/gpt-from-scratch
- Owner: raj-pulapakura
- Created: 2024-01-21T15:20:22.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-26T11:21:26.000Z (12 months ago)
- Last Synced: 2024-11-10T00:29:48.926Z (3 months ago)
- Language: Jupyter Notebook
- Size: 1.15 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
![techno chameleon](https://github.com/raj-pulapakura/gpt-from-scratch/assets/87762282/ba92a155-9580-404d-b723-4c4147ba2d9a)
Andrej Karpathy has a wonderful series in which he builds out neural networks and language models from scratch, all the way up to implementing a Transformer! Check out [his playlist](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ).

# Makemore (neural network fundamentals)
| Notebook | Andrej's Video |
| ----------- | ----------- |
| building makemore part 1.ipynb | [The spelled-out intro to language modeling: building makemore](https://www.youtube.com/watch?v=PaCmpygFfXo&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=2) |
| building makemore part 2 MLP.ipynb | [Building makemore Part 2: MLP](https://www.youtube.com/watch?v=TCH_1BHY58I&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=3) |
| building makemore part 3 activations,gradients,batchnorm.ipynb | [Building makemore Part 3: Activations & Gradients, BatchNorm](https://www.youtube.com/watch?v=PaCmpygFfXo&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=2) |
| building makemore part 5 wavenet.ipynb | [Building makemore Part 5: Building a WaveNet](https://www.youtube.com/watch?v=t3YJ5hKiMQ0&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=6) |

# GPT
In Andrej's [*Let's build GPT: from scratch, in code, spelled out.*](https://www.youtube.com/watch?v=kCc8FmEb1nY&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=7), we implemented a GPT (Generative Pretrained Transformer) from scratch and trained it on a corpus of Shakespeare text. You can find the code for my implementation in the `gpt` folder.
| File | Description |
| ----------- | ----------- |
| bigram 1.py | Implemented a bigram model, which uses only the immediately preceding character to predict the next character (see the bigram sketch below the table) |
| gpt 2 with attention.py | Added scaled dot-product self-attention to the model (see the attention-head sketch below) |
| gpt 3 ffwd.py | Continued the Transformer implementation with feed-forward neural nets |
| gpt 4 transformer blocks.py | Built transformer blocks containing multi-head attention and a feed-forward net (see the block sketch below) |
| gpt 5 residual connections.py | Added residual connections to mitigate vanishing gradients |
| gpt 6 layernorm.py | Added layer normalization |
| gpt 7 scaling up.py | Scaled up essentially every hyperparameter of the network; achieves fairly decent generations (train on a GPU) |
| gpt-dev.ipynb | Notebook which explains self-attention |
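
For orientation, here is a minimal sketch of the kind of bigram model `bigram 1.py` builds, following the approach from the video; class and variable names are illustrative rather than copied from the file.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BigramLanguageModel(nn.Module):
    """Each character's embedding row is read directly as the logits over the next character."""

    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        logits = self.token_embedding(idx)  # (B, T, vocab_size)
        if targets is None:
            return logits, None
        B, T, C = logits.shape
        loss = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))
        return logits, loss
```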
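
The core of `gpt 2 with attention.py` is a scaled dot-product self-attention head with a causal mask. A minimal sketch (dimensions and names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Head(nn.Module):
    """One head of masked (causal) scaled dot-product self-attention."""

    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask: each position may only attend to earlier positions
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k = self.key(x)                                      # (B, T, head_size)
        q = self.query(x)                                    # (B, T, head_size)
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5  # (B, T, T) scaled scores
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        v = self.value(x)                                    # (B, T, head_size)
        return wei @ v                                       # (B, T, head_size)
```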
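
And a sketch of how `gpt 4 transformer blocks.py` through `gpt 6 layernorm.py` fit together: multi-head attention plus a feed-forward net, each wrapped in a residual connection and layer normalization. It reuses the `Head` class from the sketch above; the pre-norm arrangement follows the video, and the names are again illustrative.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Several attention heads in parallel, concatenated and projected back to n_embd."""

    def __init__(self, n_head, n_embd, head_size, block_size):
        super().__init__()
        self.heads = nn.ModuleList(
            [Head(n_embd, head_size, block_size) for _ in range(n_head)]
        )
        self.proj = nn.Linear(n_head * head_size, n_embd)

    def forward(self, x):
        out = torch.cat([h(x) for h in self.heads], dim=-1)
        return self.proj(out)


class FeedForward(nn.Module):
    """Position-wise feed-forward net with the usual 4x expansion."""

    def __init__(self, n_embd):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        return self.net(x)


class Block(nn.Module):
    """Transformer block: communication (attention) then computation (feed-forward)."""

    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        head_size = n_embd // n_head
        self.sa = MultiHeadAttention(n_head, n_embd, head_size, block_size)
        self.ffwd = FeedForward(n_embd)
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        # residual (skip) connections keep gradients flowing through deep stacks
        x = x + self.sa(self.ln1(x))
        x = x + self.ffwd(self.ln2(x))
        return x
```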