Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/karpathy/LLM101n
LLM101n: Let's build a Storyteller
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/karpathy/LLM101n
- Owner: karpathy
- Archived: true
- Created: 2024-05-27T00:23:38.000Z (8 months ago)
- Default Branch: master
- Last Pushed: 2024-08-01T01:20:33.000Z (5 months ago)
- Last Synced: 2024-09-30T21:41:18.033Z (3 months ago)
- Homepage:
- Size: 269 KB
- Stars: 29,043
- Watchers: 2,219
- Forks: 1,591
- Open Issues: 19
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- StarryDivineSky - karpathy/LLM101n
- Awesome-LLM - LLM101n - Let's build a Storyteller. (Trending LLM Projects)
- awesome-LLM-resourses - LLM101n
- awesome-llm-and-aigc - karpathy/LLM101n - end-to-end from basics to a functioning web app similar to ChatGPT, from scratch in Python, C and CUDA, and with minimal computer science prerequisites. By the end you should have a relatively deep understanding of AI, LLMs, and deep learning more generally. (Summary)
- awesome-ai-papers - [LLM101n - course](https://github.com/mlabonne/llm-course)\]\[[intro-llm](https://intro-llm.github.io/)\]\[[llm-cookbook](https://github.com/datawhalechina/llm-cookbook)\]\[[hugging-llm](https://github.com/datawhalechina/hugging-llm)\]\[[generative-ai-for-beginners](https://github.com/microsoft/generative-ai-for-beginners)\]\[[awesome-generative-ai-guide](https://github.com/aishwaryanr/awesome-generative-ai-guide)\]\[[LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch)\]\[[llm-action](https://github.com/liguodongiot/llm-action)\]\[[llms_idx](https://dongnian.icu/llms/llms_idx/)\]\[[tiny-universe](https://github.com/datawhalechina/tiny-universe)\]\[[AISystem](https://github.com/chenzomi12/AISystem)\] (NLP / 3. Pretraining)
- awesome-ai-papers - [LLM101n - course](https://github.com/mlabonne/llm-course)\]\[[intro-llm](https://intro-llm.github.io/)\]\[[llm-cookbook](https://github.com/datawhalechina/llm-cookbook)\]\[[hugging-llm](https://github.com/datawhalechina/hugging-llm)\]\[[generative-ai-for-beginners](https://github.com/microsoft/generative-ai-for-beginners)\]\[[awesome-generative-ai-guide](https://github.com/aishwaryanr/awesome-generative-ai-guide)\]\[[LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch)\]\[[llm-action](https://github.com/liguodongiot/llm-action)\]\[[llms_idx](https://dongnian.icu/llms/llms_idx/)\]\[[tiny-universe](https://github.com/datawhalechina/tiny-universe)\] (NLP / 3. Pretraining)
- alan_awesome_llm - LLM101n
README
# LLM101n: Let's build a Storyteller
---
**!!! NOTE: this course does not yet exist. It is currently being developed by [Eureka Labs](https://eurekalabs.ai). Until it is ready I am archiving this repo !!!**
---
![LLM101n header image](llm101n.jpg)
> What I cannot create, I do not understand. -Richard Feynman
In this course we will build a Storyteller AI Large Language Model (LLM). Hand in hand, you'll be able to create, refine and illustrate little [stories](https://huggingface.co/datasets/roneneldan/TinyStories) with the AI. We are going to build everything end-to-end from basics to a functioning web app similar to ChatGPT, from scratch in Python, C and CUDA, and with minimal computer science prerequisites. By the end you should have a relatively deep understanding of AI, LLMs, and deep learning more generally.
**Syllabus**
- Chapter 01 **Bigram Language Model** (language modeling)
- Chapter 02 **Micrograd** (machine learning, backpropagation)
- Chapter 03 **N-gram model** (multi-layer perceptron, matmul, gelu)
- Chapter 04 **Attention** (attention, softmax, positional encoder)
- Chapter 05 **Transformer** (transformer, residual, layernorm, GPT-2)
- Chapter 06 **Tokenization** (minBPE, byte pair encoding)
- Chapter 07 **Optimization** (initialization, optimization, AdamW)
- Chapter 08 **Need for Speed I: Device** (device, CPU, GPU, ...)
- Chapter 09 **Need for Speed II: Precision** (mixed precision training, fp16, bf16, fp8, ...)
- Chapter 10 **Need for Speed III: Distributed** (distributed optimization, DDP, ZeRO)
- Chapter 11 **Datasets** (datasets, data loading, synthetic data generation)
- Chapter 12 **Inference I: kv-cache** (kv-cache)
- Chapter 13 **Inference II: Quantization** (quantization)
- Chapter 14 **Finetuning I: SFT** (supervised finetuning SFT, PEFT, LoRA, chat)
- Chapter 15 **Finetuning II: RL** (reinforcement learning, RLHF, PPO, DPO)
- Chapter 16 **Deployment** (API, web app)
- Chapter 17 **Multimodal** (VQVAE, diffusion transformer)

**Appendix**
Further topics to work into the progression above:
- Programming languages: Assembly, C, Python
- Data types: Integer, Float, String (ASCII, Unicode, UTF-8)
- Tensor: shapes, views, strides, contiguous, ...
- Deep Learning frameworks: PyTorch, JAX
- Neural Net Architecture: GPT (1,2,3,4), Llama (RoPE, RMSNorm, GQA), MoE, ...
- Multimodal: Images, Audio, Video, VQVAE, VQGAN, diffusion
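
To give a flavor of where the syllabus starts (Chapter 01, the bigram language model), here is a minimal sketch in plain Python. It is not course material, since the course does not yet exist. The toy corpus, the `.` start/end marker, and the function names `train_bigram`, `next_char_probs` and `sample` are all assumptions made for illustration. The idea is only to count which character follows which, normalize the counts into probabilities, and sample from them.

```python
import random
from collections import defaultdict


def train_bigram(words):
    """Count character bigrams, using '.' as an assumed start/end marker."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        chars = ["."] + list(w) + ["."]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts


def next_char_probs(counts, ch):
    """Normalize the counts of characters following `ch` into probabilities."""
    total = sum(counts[ch].values())
    return {b: n / total for b, n in counts[ch].items()}


def sample(counts, rng, max_len=20):
    """Generate one word by repeatedly sampling the next character."""
    out, ch = [], "."
    for _ in range(max_len):
        probs = next_char_probs(counts, ch)
        ch = rng.choices(list(probs), weights=list(probs.values()))[0]
        if ch == ".":  # end marker reached
            break
        out.append(ch)
    return "".join(out)


# Toy corpus, purely illustrative (hypothetical data).
words = ["once", "upon", "a", "time"]
counts = train_bigram(words)
rng = random.Random(0)  # seeded so runs are repeatable
print(sample(counts, rng))
```

A count table like this is about the simplest language model there is: it only looks one character back, which is presumably why the later chapters move on to n-grams, attention and the Transformer.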