https://github.com/teleprint-me/tiny
A super simple transformer implementation
- Host: GitHub
- URL: https://github.com/teleprint-me/tiny
- Owner: teleprint-me
- Created: 2025-03-07T05:18:26.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-03-07T06:05:48.000Z (2 months ago)
- Last Synced: 2025-03-07T06:27:47.061Z (2 months ago)
- Language: Python
- Size: 13.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Tiny
Tiny is a super simple Transformer implementation for debugging
[Mini](https://github.com/teleprint-me/mini.git).

The Transformer model itself is rather simple to implement. The complexity
arises from the surrounding tooling and pipeline. Tiny is designed to strip
that pipeline down to its core fundamentals.
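
To illustrate how small the core of a Transformer is, here is a dependency-free sketch of scaled dot-product attention. It is not taken from Tiny's source; the function names `softmax` and `attention` and the list-of-lists representation are assumptions made for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    q, k, v are lists of row vectors (hypothetical layout, not Tiny's).
    """
    d = len(k[0])  # key dimension used for scaling
    out = []
    for qi in q:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output column at a time.
        out.append(
            [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))]
        )
    return out
```

A real model wraps this operation in learned projections, multiple heads, and feed-forward layers, but the attention step itself stays this compact.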
## **Installation & Setup**
### **1. Clone the repository**
```sh
git clone https://github.com/teleprint-me/tiny.git
cd tiny
```

### **2. Set up a virtual environment**
```sh
python3.12 -m venv .venv
source .venv/bin/activate
```

### **3. Install dependencies**
#### **Install PyTorch**
- **CUDA**
```sh
pip install torch
```

_On Linux, the default PyTorch wheel includes CUDA support._
- **CPU**
```sh
pip install torch --index-url https://download.pytorch.org/whl/cpu
```

- **ROCm**
```sh
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm6.2.4
```

#### **Install Requirements**
```sh
pip install -r requirements.txt
```

## **Usage**
### **Dataset Preparation**
Download the hotpot dataset:
```sh
python -m tiny.data.hotpot --dataset dev \
    --samples 100 \
    --output data/hotpot.json
```

_Samples are selected at random._
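
Random selection of this kind can be sketched as follows. This is a minimal illustration and not Tiny's actual code: `sample_records` and its `seed` parameter are hypothetical names, and the dataset is assumed to be a list of JSON-serializable records.

```python
import json
import random


def sample_records(records, samples, seed=None):
    """Pick up to `samples` distinct records at random.

    `seed` is a hypothetical parameter that makes the selection repeatable.
    """
    rng = random.Random(seed)
    count = min(samples, len(records))  # never request more than exist
    return rng.sample(records, count)


def write_records(records, path):
    """Write the selected records to `path` as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
```

Passing a fixed `seed` is useful when debugging, since the same subset is drawn on every run.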
### **Pre-training**
Train a model from scratch on a dataset:
```sh
python -m tiny.trainer --dname cuda \
    --vocab-path model/vocab.json \
    --model-path model/tiny.pth \
    --dataset-path data/hotpot.json \
    --save-every 1
```
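
Before training, it can help to check which device the installed PyTorch build can actually use. The sketch below assumes that `--dname` accepts a PyTorch device name such as `cuda` or `cpu`, which the README does not state explicitly; `pick_device` is a hypothetical helper, not part of Tiny.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU backend, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment yet.
        return "cpu"
    # ROCm builds of PyTorch expose their GPUs through the torch.cuda API too.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints `cpu`, the training command above would presumably need `--dname cpu` instead of `--dname cuda`.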