# TinyLLM

Minimal, high-performance inference engine for LLMs, built for development environments

## Overview
TinyLLM streamlines the inference pipeline with minimal overhead, focusing on memory efficiency and throughput. It includes a custom tokenizer for self-developed models and is compatible with existing LLMs through its scheduling system.
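
For intuition, the byte-level approach the tokenizer takes (listed under Features below) can be sketched in a few lines of plain Python. This is an illustrative sketch only, not TinyLLM's actual tokenizer API:

```
# Illustrative byte-level tokenization sketch, not TinyLLM's actual API.
# A byte-level tokenizer needs no learned vocabulary: every UTF-8 byte
# (0-255) is itself a token ID, so any string round-trips losslessly.

def encode(text):
    return list(text.encode("utf-8"))

def decode(ids):
    return bytes(ids).decode("utf-8", errors="replace")

assert decode(encode("hello, LLM")) == "hello, LLM"
```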

## Features
- Memory management pruning
- Efficient batch processing and response streaming
- Optimized scheduling for multi-model deployments
- Custom tokenizer implementation for self-developed models
- Inference API
- KV cache implementation (see the sketch after this list)
- Training CLI for development models
- Byte-level tokenization
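
The KV cache entry refers to the standard decoding optimization of storing attention keys and values so earlier tokens are never re-projected. Here is a minimal sketch of the idea, with hypothetical names and no claim about TinyLLM's internals:

```
# Hedged sketch of a per-layer KV cache. Names and structure are
# hypothetical, not TinyLLM's internal implementation.

class KVCache:
    def __init__(self, num_layers):
        # One growing (keys, values) pair per transformer layer.
        self.keys = [[] for _ in range(num_layers)]
        self.values = [[] for _ in range(num_layers)]

    def append(self, layer, k, v):
        # Cache this step's key/value so later steps can attend over the
        # whole prefix without recomputing projections for old tokens.
        self.keys[layer].append(k)
        self.values[layer].append(v)

    def get(self, layer):
        # Everything cached so far, consumed by the layer's attention.
        return self.keys[layer], self.values[layer]
```

Each decoded token appends its keys and values once and reads the full prefix, which is what makes incremental decoding cheap compared to recomputing attention inputs from scratch at every step.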

*This is very much still an experiment: the tokenizer in particular is rough, the scheduler is reasonably well written, and memory management is decent.*

I'll continue to slowly improve these components over my weekends.

## Scope
This is solely an inference engine. It does not:
- Implement large model architectures
- Include pre-trained models
- Support distributed training

## How to use?

Clone the repository and install it in editable mode
```
git clone https://github.com/andrewn6/tinyllm
cd tinyllm
pip install -e .
```

Register your trained model
```
tinyllm model register transformer-19m v1 \
--checkpoint models/tiny-19m.pt \
--model-type native \
--description "19M parameter transformer"
```

Serve a registered model on localhost
```
tinyllm serve \
--model-name mymodel \
--port 8000 \
--model-type native
```
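
Once serving, you can query the model over HTTP. The route and payload below are assumptions for illustration only; the actual endpoint and fields are defined by TinyLLM's inference API, so check the source:

```
# Hypothetical client call. The /generate route and the prompt/max_tokens
# fields are assumptions, not TinyLLM's documented API.
import json
import urllib.request

payload = json.dumps({"prompt": "Hello", "max_tokens": 32}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8000/generate",  # port matches --port above
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))
```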

List models
```
tinyllm model list
```