https://github.com/yinqiwen/lmsf

Host: GitHub
URL: https://github.com/yinqiwen/lmsf
Owner: yinqiwen
Created: 2023-06-29T08:19:44.000Z (over 2 years ago)
Default Branch: rust
Last Pushed: 2024-04-08T11:38:44.000Z (over 1 year ago)
Last Synced: 2025-03-29T04:25:16.598Z (7 months ago)
Language: Cuda
Size: 1.07 MB
Stars: 4
Watchers: 3
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md

Rust LLM Serving Framework

## Features

- Paged Attention
- Continuous Batching
- Quantization
  - awq
  - squeezellm
- Models
  - llama
  - gemma
  - chatglm
## Getting Started

**Examples**
```sh
$ cargo run --release --example llm_engine_example -- --model <MODEL> --gpu-memory-utilization 0.95 --block-size 8 --max-model-len 1024
```

**API Server**
```sh
$ cargo build --release
$ ./target/release/entrypoints --model <MODEL> --gpu-memory-utilization 0.95 --block-size 8 --max-model-len 1024 --host 0.0.0.0 --port 8000
```
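
**Sizing note (illustrative)**

The commands above pass `--block-size 8` and `--max-model-len 1024`. Assuming `--block-size` is the number of tokens stored per KV-cache block, as is usual for paged attention (this README does not confirm it), the sketch below estimates how many blocks one sequence can occupy. The function names `blocks_needed` and `wasted_slots` are hypothetical helpers for this note and are not part of lmsf.

```rust
/// Number of fixed-size KV-cache blocks needed to hold `num_tokens` tokens.
/// Assumption: `block_size` counts tokens per block (paged-attention style).
fn blocks_needed(num_tokens: usize, block_size: usize) -> usize {
    assert!(block_size > 0, "block size must be positive");
    // Ceiling division: a partially filled last block still occupies a whole block.
    (num_tokens + block_size - 1) / block_size
}

/// Token slots left unused in the last block of a sequence.
fn wasted_slots(num_tokens: usize, block_size: usize) -> usize {
    blocks_needed(num_tokens, block_size) * block_size - num_tokens
}
```

Under that assumption, a sequence at the full `--max-model-len` of 1024 tokens needs `blocks_needed(1024, 8)` = 128 blocks, and any sequence wastes at most `block_size - 1` = 7 token slots in its last block.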
