Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/nikolaydubina/basic-openai-pytorch-server

Minimal HTTP inference server in OpenAI API with Pytorch and CUDA
https://github.com/nikolaydubina/basic-openai-pytorch-server

cuda docker llm openai pytorch server

Last synced: 4 days ago
JSON representation

Minimal HTTP inference server in OpenAI API with Pytorch and CUDA

Awesome Lists containing this project

README

        

Minimal HTTP inference server in OpenAI API[^1].

[![Hits](https://hits.sh/github.com/nikolaydubina/basic-openai-pytorch-server.svg?view=today-total&extraCount=40)](https://hits.sh/github.com/nikolaydubina/basic-openai-pytorch-server/)

> _When you don't want to install countless frameworks, generators, etc. When all you need is small Docker file and single main for http server._

> [!WARNING]
> Limited OpenAI API compatibility.

- 100 lines of code
- CUDA
- Pytorch
- HuggingFace models (e.g. Llama 3.2 11B Vision)
- OpenTelemetry
- JSON schema output
- 150 token/s: NVIDIA L4 (GCP `g2-standard-8`) Llama 3.2 11B Vision Instruct

[^1]: https://github.com/openai/openai-openapi