https://github.com/Mxoder/LLM-from-scratch

Notes on reproducing LLM-related work from scratch

Host: GitHub
URL: https://github.com/Mxoder/LLM-from-scratch
Owner: Mxoder
License: apache-2.0
Created: 2024-04-29T02:33:03.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2025-04-29T07:47:51.000Z (about 1 year ago)
Last Synced: 2025-04-29T08:39:36.203Z (about 1 year ago)
Language: Jupyter Notebook
Homepage:
Size: 1.77 MB
Stars: 186
Watchers: 5
Forks: 26
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

StarryDivineSky - Mxoder/TinyStories

README

# LLM-from-scratch

A collection of notes on reproducing LLM work from scratch, along with a few essays.

- [x] 1. Pretraining a super-mini LLaMA 3 from scratch: reproducing TinyStories

- [x] 2. Implementing LoRA from scratch in PyTorch

- [ ] 3. Implementing the `generate` method from scratch

## Zhihu links

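1. [Pretraining a super-mini LLaMA 3 from scratch: reproducing TinyStories](https://zhuanlan.zhihu.com/p/695130168)

2. [Implementing LoRA from scratch in PyTorch](https://zhuanlan.zhihu.com/p/702419731)

3. [A detailed walkthrough of the Qwen2.5-Math technical report](https://zhuanlan.zhihu.com/p/721015204)

4. [A detailed walkthrough of the Qwen2.5-Coder technical report](https://zhuanlan.zhihu.com/p/721189499)

5. [My API calls are too slow! Speeding up LLM API calls with asynchronous requests](https://zhuanlan.zhihu.com/p/1896894945463362125)

6. [How does Qwen3 implement hybrid reasoning (fast and slow thinking)?](https://zhuanlan.zhihu.com/p/1900555481715570305)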
