https://github.com/dsnsgithub/mango-llm
Goal: Create an LLM that can produce English text through training (initially on stories, later on web text).
Last synced: 27 days ago
- Host: GitHub
- URL: https://github.com/dsnsgithub/mango-llm
- Owner: dsnsgithub
- Created: 2026-04-20T22:07:58.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-17T06:36:34.000Z (about 1 month ago)
- Last Synced: 2026-05-17T06:43:30.074Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 142 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🥭 Mango LLM
> **M**y **A**nswers **N**eed **G**enuine **O**versight => **MANGO**
### **Work in progress.**
Goal: Create an LLM that can produce coherent English text (initially trained on stories, later on web text), and learn machine learning concepts in the process.
As many layers/parts of the LLM as possible are built from basic components and individual matrix multiplications (instead of using pre-made components).
AI was used to help me understand LLM concepts, but almost all of the code in this repo was handwritten and is loosely based on GPT-2 and the original "Attention Is All You Need" paper.
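To give a sense of what "built from individual matrix multiplications" can look like, here is a minimal sketch of single-head causal self-attention. It is not the code in `src/`: it uses NumPy purely for illustration, and the function and variable names (`causal_self_attention`, `w_q`, `w_k`, `w_v`) are assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention from plain matrix multiplications.

    x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_head).
    Returns the attended values and the attention weights.
    """
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    d_head = q.shape[-1]

    # Scaled dot-product scores between every pair of positions.
    scores = (q @ k.T) / np.sqrt(d_head)

    # Causal mask: position i may only attend to positions <= i.
    seq_len = x.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical toy input: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Each row of `weights` sums to 1, and everything above the diagonal is zero, so a token never attends to tokens that come after it.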
## Requirements
- [uv](https://docs.astral.sh/uv/)
uv can automatically install the required Python version (even if you don't have Python installed), along with any required packages.
On Linux and Windows, this LLM uses the CUDA accelerator. On macOS, it uses the default MPS (Metal Performance Shaders) accelerator on Apple Silicon when available.
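The accelerator choice described above could be sketched as follows. This assumes the project is built on PyTorch, which this README does not state. The `pick_device` helper and the CPU fallback are hypothetical, not taken from `src/`.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", in that order of preference."""
    try:
        import torch  # assumption: the project uses PyTorch
    except ImportError:
        return "cpu"  # assumed fallback when PyTorch is not installed

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps is missing in older PyTorch releases, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"  # assumed fallback when no accelerator is found


device = pick_device()
print(f"Using device: {device}")
```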
## Run
To train, first download the required dataset from Kaggle:
Link: https://www.kaggle.com/datasets/thedevastator/tinystories-narrative-classification/data
Download `train.csv` and `validation.csv`, then place them in `./dataset/TinyStories` (create the directory if it does not exist).
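Before starting a training run, it can help to confirm that both files are where the training script expects them. The check below is a small standalone sketch, not part of the repo. The helper name `missing_dataset_files` is hypothetical, and it should be run from the repository root.

```python
from pathlib import Path

REQUIRED_FILES = ("train.csv", "validation.csv")


def missing_dataset_files(dataset_dir="dataset/TinyStories"):
    """Return the names of required dataset files that are not present."""
    root = Path(dataset_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("Dataset files found.")
```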
---
With [uv](https://docs.astral.sh/uv/) (recommended given the project config):
```bash
uv sync              # install the required Python version and packages
uv run src/train.py  # train the model
uv run src/run.py    # run the trained model
```
## Layout
| Path | Role |
|------|------|
| `src/` | LLM code; the main files are `src/run.py` and `src/train.py` |
| `dist/` | Contains the trained LLM, which can be run with `src/run.py` |
| `old/` | Contains the original LLM, useful for beginners trying to understand the basics |