Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/karpathy/ng-video-lecture
Last synced: 27 days ago
- Host: GitHub
- URL: https://github.com/karpathy/ng-video-lecture
- Owner: karpathy
- Created: 2023-01-17T05:27:03.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2024-01-31T13:43:36.000Z (9 months ago)
- Last Synced: 2024-10-01T12:05:06.046Z (about 1 month ago)
- Language: Python
- Size: 430 KB
- Stars: 3,459
- Watchers: 56
- Forks: 902
- Open Issues: 35
Metadata Files:
- Readme: README.md
README
# nanogpt-lecture
Code created in the [Neural Networks: Zero To Hero](https://karpathy.ai/zero-to-hero.html) video lecture series, specifically in the first lecture on nanoGPT. Publishing it here as a GitHub repo so people can easily hack on it, walk through its `git log` history, etc.
NOTE: sadly I did not go much into model initialization in the video lecture, but it is quite important for good performance. The current code will train and work fine, but it converges more slowly because it starts off in a not-great spot in weight space. Please see [nanoGPT model.py](https://github.com/karpathy/nanoGPT/blob/master/model.py), in particular the `# init all weights` comment and how it calls the `_init_weights` function. Even more sadly, the code in this repo names and stores its modules a bit differently, so it's not possible to directly copy-paste that code here. My current plan is to publish a supplementary video lecture covering these parts, and then push the exact code changes to this repo. For now I'm keeping it as is, so that it is almost exactly what we actually covered in the video.
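To make the initialization point concrete, here is a minimal sketch of the general pattern: a per-module init function applied to the whole model with `Module.apply`. It is not the code from this repo or a copy of nanoGPT's `model.py`. `TinyModel`, its layer names, the sizes, and the `std=0.02` value are assumptions chosen for illustration, so check nanoGPT for the exact scheme it uses.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyModel(nn.Module):
    """Hypothetical stand-in model; not the module layout used in this repo."""

    def __init__(self, vocab_size=65, n_embd=32):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)
        # Module.apply calls the function on every submodule, then on self.
        self.apply(self._init_weights)

    def _init_weights(self, module):
        # Assumed scheme: small Gaussian weights, zero biases.
        if isinstance(module, nn.Linear):
            nn.init.normal_(module.weight, mean=0.0, std=0.02)
            if module.bias is not None:
                nn.init.zeros_(module.bias)
        elif isinstance(module, nn.Embedding):
            nn.init.normal_(module.weight, mean=0.0, std=0.02)

    def forward(self, idx):
        return self.lm_head(self.token_embedding(idx))


torch.manual_seed(0)
vocab_size = 65
model = TinyModel(vocab_size=vocab_size)

idx = torch.randint(0, vocab_size, (4, 8))      # (batch, time) token ids
targets = torch.randint(0, vocab_size, (4, 8))  # random targets, just for the loss check
logits = model(idx)                             # (batch, time, vocab_size)
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))

# With small initial weights the logits are near zero, so the predicted
# distribution is close to uniform and the loss is close to ln(vocab_size).
uniform_loss = math.log(vocab_size)
```

Comparing `loss.item()` with `uniform_loss` is a quick sanity check: a starting loss well above `ln(vocab_size)` suggests the model begins confidently wrong and spends its early steps undoing that.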
### License
MIT