Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/serge-chat/serge
A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
alpaca docker fastapi llama llamacpp nginx python svelte sveltekit tailwindcss web
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/serge-chat/serge
- Owner: serge-chat
- License: apache-2.0
- Created: 2023-03-19T08:33:29.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-03T13:42:51.000Z (4 months ago)
- Last Synced: 2024-08-05T03:01:31.782Z (3 months ago)
- Topics: alpaca, docker, fastapi, llama, llamacpp, nginx, python, svelte, sveltekit, tailwindcss, web
- Language: Svelte
- Homepage: https://serge.chat
- Size: 2.84 MB
- Stars: 5,614
- Watchers: 53
- Forks: 402
- Open Issues: 28
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-open-chatgpt - nsarrazin/serge
- StarryDivineSky - serge-chat/serge
- Awesome-LLM - Serge - a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted! (LLM Deployment)
- awesome-homelab - Serge - A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API. (Apps / AI)
- Self-Hosting-Guide - Serge - Self-hosted & dockerized, with an easy to use API. (Tools for Self-Hosting / Running Locally on Windows, MacOS, and Linux)
README
# Serge - LLaMA made easy 🦙
![License](https://img.shields.io/github/license/serge-chat/serge)
[![Discord](https://img.shields.io/discord/1088427963801948201?label=Discord)](https://discord.gg/62Hc6FEYQH)

Serge is a chat interface crafted with [llama.cpp](https://github.com/ggerganov/llama.cpp) for running GGUF models. No API keys, entirely self-hosted!
- 🚀 **SvelteKit** frontend
- 💾 **[Redis](https://github.com/redis/redis)** for storing chat history & parameters
- ⚙️ **FastAPI + LangChain** for the API, wrapping calls to [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [python bindings](https://github.com/abetlen/llama-cpp-python)

🎥 Demo:

[demo.webm](https://user-images.githubusercontent.com/25119303/226897188-914a6662-8c26-472c-96bd-f51fc020abf6.webm)
## ⚡️ Quick start

🐳 Docker:
```bash
docker run -d \
--name serge \
-v weights:/usr/src/app/weights \
-v datadb:/data/db/ \
-p 8008:8008 \
ghcr.io/serge-chat/serge:latest
```

🐙 Docker Compose:
```yaml
services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
```

Then visit http://localhost:8008. The API documentation is available at http://localhost:8008/api/docs.
### 🌍 Environment Variables
The following Environment Variables are available:
| Variable Name | Description | Default Value |
|-----------------------|---------------------------------------------------------|--------------------------------------|
| `SERGE_DATABASE_URL` | Database connection string | `sqlite:////data/db/sql_app.db` |
| `SERGE_JWT_SECRET` | Key for auth token encryption. Use a random string | `uF7FGN5uzfGdFiPzR` |
| `SERGE_SESSION_EXPIRY`| Duration in minutes before a user must reauthenticate | `60` |
| `NODE_ENV`            | Node.js running environment                             | `production`                         |

## 🖥️ Windows
Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.
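The settings in the environment-variable table are ordinary process environment variables. A minimal sketch of reading them with the documented defaults (the names and defaults come from the table; the reading logic itself is an assumption, not Serge's actual code):

```python
import os

# Variable names and defaults mirror the documented table; this is a
# generic sketch of environment-based configuration, not Serge's code.
database_url = os.environ.get("SERGE_DATABASE_URL", "sqlite:////data/db/sql_app.db")
jwt_secret = os.environ.get("SERGE_JWT_SECRET", "uF7FGN5uzfGdFiPzR")
session_expiry_min = int(os.environ.get("SERGE_SESSION_EXPIRY", "60"))
```

With Docker, the same variables would be passed via `-e` flags or the `environment:` key of a Compose file.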
## ☁️ Kubernetes
Instructions for setting up Serge on Kubernetes can be found in the [wiki](https://github.com/serge-chat/serge/wiki/Integrating-Serge-in-your-orchestration#kubernetes-example).
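As a rough illustration only (resource names, volume types, and the Service shape are assumptions; see the wiki for the maintained example), a Deployment exposing the container port used above might look like:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: serge
spec:
  replicas: 1
  selector:
    matchLabels:
      app: serge
  template:
    metadata:
      labels:
        app: serge
    spec:
      containers:
        - name: serge
          image: ghcr.io/serge-chat/serge:latest
          ports:
            - containerPort: 8008
          volumeMounts:
            - name: weights
              mountPath: /usr/src/app/weights
            - name: datadb
              mountPath: /data/db
      volumes:
        - name: weights
          emptyDir: {}  # replace with a PersistentVolumeClaim to keep weights
        - name: datadb
          emptyDir: {}  # replace with a PersistentVolumeClaim to keep the database
---
apiVersion: v1
kind: Service
metadata:
  name: serge
spec:
  selector:
    app: serge
  ports:
    - port: 8008
      targetPort: 8008
```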
## 🧠 Supported Models
| Category | Models |
|:-------------:|:-------|
| **Alfred** | 40B-1023 |
| **BioMistral** | 7B |
| **Code** | 13B, 33B |
| **CodeLLaMA** | 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python |
| **Codestral** | 22B v0.1 |
| **Gemma** | 2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct, 2-9B, 2-9B-Instruct, 2-27B, 2-27B-Instruct |
| **Gorilla** | Falcon-7B-HF-v0, 7B-HF-v1, Openfunctions-v1, Openfunctions-v2 |
| **Falcon** | 7B, 7B-Instruct, 11B, 40B, 40B-Instruct |
| **LLaMA 2** | 7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST |
| **LLaMA 3** | 11B-Instruct, 13B-Instruct, 16B-Instruct |
| **LLaMA Pro** | 8B, 8B-Instruct |
| **Mathstral** | 7B |
| **Med42** | 70B, v2-8B, v2-70B |
| **Medalpaca** | 13B |
| **Medicine** | Chat, LLM |
| **Meditron** | 7B, 7B-Chat, 70B, 3-8B |
| **Meta-LlaMA-3** | 3-8B, 3.1-8B, 3.2-1B-Instruct, 3-8B-Instruct, 3.1-8B-Instruct, 3.2-3B-Instruct, 3-70B, 3.1-70B, 3-70B-Instruct, 3.1-70B-Instruct |
| **Mistral** | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca, Nemo-Instruct |
| **MistralLite** | 7B |
| **Mixtral** | 8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1 |
| **Neural-Chat** | 7B-v3.3 |
| **Notus** | 7B-v1 |
| **Notux** | 8x7b-v1 |
| **Nous-Hermes 2** | Mistral-7B-DPO, Mixtral-8x7B-DPO, Mistral-8x7B-SFT |
| **OpenChat** | 7B-v3.5-1210, 8B-v3.6-20240522 |
| **OpenCodeInterpreter** | DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B |
| **OpenLLaMA** | 3B-v2, 7B-v2, 13B-v2 |
| **Orca 2** | 7B, 13B |
| **Phi** | 2-2.7B, 3-mini-4k-instruct, 3.1-mini-4k-instruct, 3.1-mini-128k-instruct, 3.5-mini-instruct, 3-medium-4k-instruct, 3-medium-128k-instruct |
| **Python Code** | 13B, 33B |
| **PsyMedRP** | 13B-v1, 20B-v1 |
| **Starling LM** | 7B-Alpha |
| **SOLAR** | 10.7B-v1.0, 10.7B-instruct-v1.0 |
| **TinyLlama** | 1.1B |
| **Vicuna** | 7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder |
| **WizardLM** | 2-7B, 13B-v1.2, 70B-v1.0 |
| **Zephyr** | 3B, 7B-Alpha, 7B-Beta |

Additional models can be requested by opening a GitHub issue. Other models are also available at [Serge Models](https://github.com/Smartappli/serge-models).
## ⚠️ Memory Usage
LLaMA will crash if you don't have enough available memory for the model.
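To sanity-check memory before loading a model, rule-of-thumb arithmetic helps: quantized weights take roughly parameters × bits ÷ 8 bytes, plus runtime buffers. This sketch is an assumption for illustration, not from the Serge docs; the overhead factor in particular is a guess that ignores context length:

```python
def estimate_gguf_ram_gb(n_params_billion: float,
                         bits_per_weight: float = 4.0,
                         overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized GGUF model.

    n_params_billion: model size in billions of parameters (e.g. 7 for a 7B model)
    bits_per_weight:  quantization width (4.0 for Q4_0-style quants)
    overhead:         assumed multiplier for KV cache and runtime buffers
    """
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B model at 4-bit comes to roughly 4.2 GB by this estimate.
```

Long contexts grow the KV cache well beyond this, so treat the result as a lower bound.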
## 💬 Support

Need help? Join our [Discord](https://discord.gg/62Hc6FEYQH).
## 🧾 License
[Nathan Sarrazin](https://github.com/nsarrazin) and [Contributors](https://github.com/serge-chat/serge/graphs/contributors). `Serge` is free and open-source software licensed under the [MIT License](https://github.com/serge-chat/serge/blob/main/LICENSE-MIT) and [Apache-2.0](https://github.com/serge-chat/serge/blob/main/LICENSE-APACHE).
## 🤝 Contributing
If you discover a bug or have a feature idea, feel free to open an issue or PR.
To run Serge in development mode:
```bash
git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build
```

The dev container accepts a Python debugger session on port 5678. Example `launch.json` for VSCode:
```json
{
"version": "0.2.0",
"configurations": [
{
"name": "Remote Debug",
"type": "python",
"request": "attach",
"connect": {
"host": "localhost",
"port": 5678
},
"pathMappings": [
{
"localRoot": "${workspaceFolder}/api",
"remoteRoot": "/usr/src/app/api/"
}
],
"justMyCode": false
}
]
}
```