https://github.com/synw/goinfer
Lightweight inference server for local language models
- Host: GitHub
- URL: https://github.com/synw/goinfer
- Owner: synw
- License: MIT
- Created: 2023-08-03T10:32:55.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-11-04T14:32:38.000Z (over 1 year ago)
- Last Synced: 2025-04-14T19:08:38.042Z (about 2 months ago)
- Language: Go
- Homepage: https://synw.github.io/goinfer/
- Size: 428 KB
- Stars: 5
- Watchers: 4
- Forks: 2
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Goinfer
Inference API server for local GGUF language models, based on [Llama.cpp](https://github.com/ggerganov/llama.cpp).
- **Multi models**: switch between models at runtime
- **Inference queries**: HTTP API with streaming response support
- **Tasks**: predefined language model tasks

Works with the [Infergui](https://github.com/synw/infergui) frontend.
:books: Read the documentation
- [Get started](https://synw.github.io/goinfer/get_started)
- [Install](https://synw.github.io/goinfer/get_started/install)
- [Configure](https://synw.github.io/goinfer/get_started/configure)
- [Run](https://synw.github.io/goinfer/get_started/run)
- [Llama api](https://synw.github.io/goinfer/llama_api)
- [Models state](https://synw.github.io/goinfer/llama_api/models_state)
- [Load model](https://synw.github.io/goinfer/llama_api/load_model)
- [Inference](https://synw.github.io/goinfer/llama_api/inference)
- [Tasks](https://synw.github.io/goinfer/llama_api/tasks)
- [Templates](https://synw.github.io/goinfer/llama_api/templates)
- [Openai api](https://synw.github.io/goinfer/openai_api)
- [Configure](https://synw.github.io/goinfer/openai_api/configure)
- [Endpoints](https://synw.github.io/goinfer/openai_api/endpoints)