
# llm-api

[![build.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/main.yml/badge.svg)](https://github.com/winstxnhdw/llm-api/actions/workflows/main.yml)
[![deploy.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/deploy.yml/badge.svg)](https://github.com/winstxnhdw/llm-api/actions/workflows/deploy.yml)
[![formatter.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/formatter.yml/badge.svg)](https://github.com/winstxnhdw/llm-api/actions/workflows/formatter.yml)
[![warmer.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/warmer.yml/badge.svg)](https://github.com/winstxnhdw/llm-api/actions/workflows/warmer.yml)

[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-md-dark.svg)](https://huggingface.co/spaces/winstxnhdw/llm-api)
[![Open a Pull Request](https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-pr-md-dark.svg)](https://github.com/winstxnhdw/llm-api/compare)

A fast CPU-based API for Llama 3.2, hosted on Hugging Face Spaces. For faster inference, we use [CTranslate2](https://github.com/OpenNMT/CTranslate2) as the inference engine.

## Usage

Simply `curl` the endpoint as follows.

```bash
curl -N 'https://winstxnhdw-llm-api.hf.space/api/v1/chat' \
  -H 'Content-Type: application/json' \
  -d '{ "instruction": "What is the capital of Japan?" }'
```
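You can also call the endpoint from Python. The sketch below uses only the standard library; the `chat` helper and the assumption that the streamed response body is plain text are ours, not part of the API's documented contract.

```python
import json
from urllib.request import Request, urlopen

# The public chat endpoint from the cURL example above.
API_URL = 'https://winstxnhdw-llm-api.hf.space/api/v1/chat'


def build_payload(instruction: str) -> bytes:
    """Serialise the JSON body expected by the endpoint."""
    return json.dumps({'instruction': instruction}).encode()


def chat(instruction: str) -> str:
    """Send an instruction and collect the (streamed) response body as text.

    Hypothetical helper: the endpoint streams its output (hence `curl -N`),
    so a real client may prefer to read the response incrementally.
    """
    request = Request(
        API_URL,
        data=build_payload(instruction),
        headers={'Content-Type': 'application/json'},
    )
    with urlopen(request) as response:
        return response.read().decode()
```

For example, `chat('What is the capital of Japan?')` should return the model's answer as a string.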

## Development

There are a few ways to run `llm-api` locally for development.

### Local

If you spin up the server with `uv`, you can access the Swagger UI at [localhost:49494/schema/swagger](http://localhost:49494/schema/swagger).

```bash
uv run llm-api
```

### Docker

You can access the Swagger UI at [localhost:7860/schema/swagger](http://localhost:7860/schema/swagger) after spinning the server up with Docker.

```bash
docker build -f Dockerfile.build -t llm-api .
docker run --rm -e APP_PORT=7860 -p 7860:7860 llm-api
```