My implementation of Andrej Karpathy's "Neural Networks: Zero to Hero" series
https://github.com/raj-pulapakura/gpt-from-scratch

![techno chameleon](https://github.com/raj-pulapakura/gpt-from-scratch/assets/87762282/ba92a155-9580-404d-b723-4c4147ba2d9a)

Andrej Karpathy has a wonderful series in which he builds out neural networks and language models from scratch, all the way up to implementing a Transformer! Check out [his playlist](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ).

# Makemore (neural network fundamentals)

| Notebook | Andrej's Video |
| ----------- | ----------- |
| building makemore part 1.ipynb | [The spelled-out intro to language modeling: building makemore](https://www.youtube.com/watch?v=PaCmpygFfXo&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=2) (bigram model sketched below) |
| building makemore part 2 MLP.ipynb | [Building makemore Part 2: MLP](https://www.youtube.com/watch?v=TCH_1BHY58I&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=3) |
| building makemore part 3 activations,gradients,batchnorm.ipynb | [Building makemore Part 3: Activations & Gradients, BatchNorm](https://www.youtube.com/watch?v=P6sfmUTpUmc&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=4) |
| building makemore part 5 wavenet.ipynb | [Building makemore Part 5: Building a WaveNet](https://www.youtube.com/watch?v=t3YJ5hKiMQ0&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=6) |
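
The count-based bigram model from Part 1 is small enough to sketch in full. Here is a minimal version, assuming PyTorch and a `names.txt` file of newline-separated lowercase words as used in the video; it illustrates the idea rather than reproducing the notebook's exact code:

```python
import torch

# A count-based character bigram model in the style of Part 1.
# Assumes a names.txt file of newline-separated lowercase words.
words = open("names.txt").read().splitlines()
chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0                       # "." marks both the start and end of a word
itos = {i: ch for ch, i in stoi.items()}

# Count how often each character follows each other character.
N = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for w in words:
    seq = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(seq, seq[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Turn counts into next-character probabilities (+1 for smoothing).
P = (N + 1).float()
P /= P.sum(dim=1, keepdim=True)

# Sample a new word one character at a time until "." is produced again.
g = torch.Generator().manual_seed(2147483647)
ix, out = 0, []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```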

# GPT

In Andrej's [*Let's build GPT: from scratch, in code, spelled out.*](https://www.youtube.com/watch?v=kCc8FmEb1nY&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=7), we implemented a GPT (Generative Pretrained Transformer) from scratch and trained it on a corpus of Shakespeare text. You can find the code for my implementation in the `gpt` folder.
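
The heart of the model is scaled dot-product self-attention, which `gpt-dev.ipynb` walks through. Below is a minimal sketch of a single causal attention head in PyTorch; the hyperparameters are illustrative, not the repo's exact settings:

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

n_embd, head_size, block_size = 32, 16, 8   # illustrative sizes

class Head(nn.Module):
    """One head of causal scaled dot-product self-attention."""
    def __init__(self):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # Lower-triangular mask so each position only attends to the past.
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):                               # x: (batch, time, n_embd)
        B, T, C = x.shape
        k = self.key(x)                                  # (B, T, head_size)
        q = self.query(x)                                # (B, T, head_size)
        # Affinities, scaled by 1/sqrt(head_size) so the softmax stays well-behaved.
        wei = q @ k.transpose(-2, -1) * head_size**-0.5  # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        return wei @ self.value(x)                       # (B, T, head_size)

x = torch.randn(4, block_size, n_embd)                   # dummy batch
print(Head()(x).shape)                                   # torch.Size([4, 8, 16])
```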

| File | Description |
| ----------- | ----------- |
| bigram 1.py | Implemented a bigram model, which uses the immediately previous char to predict the next char |
| gpt 2 with attention.py | Added Scaled Dot-Product Self-Attention to the model |
| gpt 3 ffwd.py | Continued the Transformer implementation with feed-forward neural nets |
| gpt 4 transformer blocks.py | Built transformer blocks containing multi-head attention and ffwd (see the sketch after this table) |
| gpt 5 residual connections.py | Added residual connections to mitigate vanishing gradients |
| gpt 6 layernorm.py | Added Layer Normalization |
| gpt 7 scaling up.py | Scaled up essentially every hyperparameter of the network; produces fairly decent generations (train on a GPU) |
| gpt-dev.ipynb | Notebook which explains self-attention |
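
Putting the later steps together (`gpt 4` through `gpt 6`), a transformer block combines multi-head attention, a feed-forward net, residual connections, and layer normalization. The sketch below fuses the query/key/value projections into one linear layer for compactness, so it illustrates the same ideas rather than mirroring the repo's file-by-file code; sizes are illustrative:

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

n_embd, n_head, block_size = 64, 4, 32      # illustrative sizes

class CausalSelfAttention(nn.Module):
    """Multi-head scaled dot-product self-attention with a causal mask."""
    def __init__(self):
        super().__init__()
        self.qkv = nn.Linear(n_embd, 3 * n_embd, bias=False)
        self.proj = nn.Linear(n_embd, n_embd)
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):                                # x: (B, T, n_embd)
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(n_embd, dim=2)
        # Reshape into (B, n_head, T, head_size) so all heads run in parallel.
        q = q.view(B, T, n_head, C // n_head).transpose(1, 2)
        k = k.view(B, T, n_head, C // n_head).transpose(1, 2)
        v = v.view(B, T, n_head, C // n_head).transpose(1, 2)
        wei = q @ k.transpose(-2, -1) * (C // n_head) ** -0.5   # (B, nh, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        out = (wei @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(out)

class FeedForward(nn.Module):
    """Position-wise feed-forward net with the 4x expansion used in the video."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        return self.net(x)

class Block(nn.Module):
    """Transformer block: attention + feed-forward, each wrapped in a
    residual connection with a (pre-)LayerNorm."""
    def __init__(self):
        super().__init__()
        self.sa = CausalSelfAttention()
        self.ffwd = FeedForward()
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.sa(self.ln1(x))    # residual around attention
        x = x + self.ffwd(self.ln2(x))  # residual around feed-forward
        return x

x = torch.randn(2, block_size, n_embd)
print(Block()(x).shape)                 # torch.Size([2, 32, 64])
```

The residual paths wrap pre-normalized sublayers (`x + sublayer(ln(x))`), the arrangement the video settles on, which keeps gradients flowing well once several blocks are stacked.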