https://github.com/winstxnhdw/llm-api
A fast CPU-based API for Llama 3.2 using CTranslate2, hosted on Hugging Face Spaces.
- Host: GitHub
- URL: https://github.com/winstxnhdw/llm-api
- Owner: winstxnhdw
- Created: 2023-12-03T10:13:57.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-23T00:00:34.000Z (11 days ago)
- Last Synced: 2025-06-23T00:29:12.825Z (11 days ago)
- Topics: ctranslate2, docker, huggingface, huggingface-spaces, llama, transformers, uv
- Language: Python
- Homepage: https://huggingface.co/spaces/winstxnhdw/llm-api
- Size: 851 KB
- Stars: 0
- Watchers: 2
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# llm-api
[main.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/main.yml)
[deploy.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/deploy.yml)
[formatter.yml](https://github.com/winstxnhdw/llm-api/actions/workflows/formatter.yml)
[Hugging Face Space](https://huggingface.co/spaces/winstxnhdw/llm-api)
[compare](https://github.com/winstxnhdw/llm-api/compare)

A fast CPU-based API for Llama 3.2, hosted on Hugging Face Spaces. To achieve faster execution, we use [CTranslate2](https://github.com/OpenNMT/CTranslate2) as our inference engine.
## Usage
Simply `curl` the endpoint as shown in the following example.
```bash
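# -N disables curl's output buffering so streamed tokens are printed as they arrive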
curl -N 'https://winstxnhdw-llm-api.hf.space/api/v1/chat' \
-H 'Content-Type: application/json' \
-d \
'{
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
```
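If you would rather call the API from code, the sketch below shows one way to do the same thing in Python. It is illustrative only: it assumes the third-party `requests` package is installed (any HTTP client works) and makes no assumption about the response format, simply printing whatever the server streams back.

```python
# chat_client.py -- a minimal sketch of a streaming client for llm-api.
# Assumes `requests` is installed (pip install requests).
import requests

# Hosted Space; use http://localhost:49494 (uv) or http://localhost:7860 (Docker)
# when running the server locally (see Development below).
BASE_URL = "https://winstxnhdw-llm-api.hf.space"

payload = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
    ]
}

# stream=True mirrors curl's -N flag: chunks are printed as the server sends them.
with requests.post(f"{BASE_URL}/api/v1/chat", json=payload, stream=True, timeout=60) as response:
    response.raise_for_status()
    for chunk in response.iter_content(chunk_size=None):
        print(chunk.decode("utf-8", errors="replace"), end="", flush=True)
print()
```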
## Development

There are a few ways to run `llm-api` locally for development.
### Local
If you spin up the server using `uv`, you may access the Swagger UI at [localhost:49494/schema/swagger](http://localhost:49494/schema/swagger).
```bash
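# serves on port 49494 by default (see the Swagger UI link above)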
uv run llm-api
```
### Docker

You can access the Swagger UI at [localhost:7860/schema/swagger](http://localhost:7860/schema/swagger) after spinning the server up with Docker.
```bash
docker build -f Dockerfile.build -t llm-api .
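# SERVER_PORT sets the port the app listens on; keep it in sync with the -p mapping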
docker run --rm -e SERVER_PORT=7860 -p 7860:7860 llm-api
```