# PUMA

Puma aims to be a lightweight, high-performance inference engine for heterogeneous devices. *Currently under active development.*

## How to Run

### Build

Run `make build` to build the **puma** binary.
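As a sketch, assuming `git`, `make`, and a working Rust toolchain are installed, building from a fresh clone might look like:

```shell
# Clone the repository and build the puma binary
git clone https://github.com/inftyai/puma.git
cd puma
make build
```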

### Run

Run `./puma help` to see all available commands.

For example, run `./puma version` to print the version of the binary.
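Putting the commands above together, a first session might look like this (output omitted, since it depends on your build):

```shell
# List all available commands
./puma help

# Print the version of the binary
./puma version
```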

## Supported Backends

Puma uses [llama.cpp](https://github.com/ggerganov/llama.cpp) as the default backend for quick prototyping; we plan to implement our own backend in the future.