Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ninehills/LLM-Fusion-API
- Host: GitHub
- URL: https://github.com/ninehills/LLM-Fusion-API
- Owner: ninehills
- License: mit
- Created: 2023-07-28T13:21:43.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-30T03:52:45.000Z (10 months ago)
- Last Synced: 2024-10-28T11:12:55.734Z (15 days ago)
- Language: Python
- Size: 31.3 KB
- Stars: 32
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# LLM-Fusion-API
## Supported API
- [OpenAI](https://platform.openai.com/docs/api-reference/introduction)
- [Wenxin](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/flfmc9do2)
- [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md)
- [MiniMax](https://api.minimax.chat/)
- [Zhipu](https://open.bigmodel.cn/doc/api#overview)
  - Also known as ChatGLM.

## OpenAI API Compatibility
### Chat Completion
| API | system message | function | stream | temperature | top_p | n | stop | max_tokens | presence_penalty | frequency_penalty | logit_bias |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OpenAI | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Wenxin | ❌* | ❌ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| FastChat | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ❌ |
| MiniMax | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ✔️ | ❌ | ❌ | ❌ |
| Zhipu | ❌ | ❌ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |

\* System messages will be converted into user/assistant message pairs.
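The system-message conversion noted for Wenxin can be pictured with a minimal sketch. This is not the project's implementation: the function name `convert_system_messages` and the acknowledgement text `"OK."` are assumptions made for illustration, and the real conversion may pair messages differently.

```python
from typing import Dict, List

Message = Dict[str, str]


def convert_system_messages(messages: List[Message], ack: str = "OK.") -> List[Message]:
    """Rewrite each system message as a user/assistant message pair.

    ASSUMPTION: the assistant half of the pair is a short fixed
    acknowledgement; the actual project may use different text.
    """
    converted: List[Message] = []
    for message in messages:
        if message["role"] == "system":
            # The system prompt becomes a user turn...
            converted.append({"role": "user", "content": message["content"]})
            # ...followed by an assistant turn so roles keep alternating.
            converted.append({"role": "assistant", "content": ack})
        else:
            converted.append(message)
    return converted


example = convert_system_messages([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

Here `example` holds three messages with the roles user, assistant, user, which suits a backend that only accepts alternating user/assistant turns.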
### Embeddings
| API | model | max_tokens |
| --- | --- | --- |
| OpenAI | text-embedding-ada-002 | 8191 |
| Wenxin | embedding-v1 | 384* |

\* If the input is longer than 384 tokens, it will be truncated.
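The truncation rule for `embedding-v1` could look roughly like the sketch below. It is only an illustration: the helper name `truncate_for_embedding` is hypothetical, and whitespace-separated words stand in for real tokens, since the tokenizer used by Wenxin is not described here.

```python
WENXIN_EMBEDDING_MAX_TOKENS = 384  # limit listed for wenxin/embedding-v1


def truncate_for_embedding(text: str, max_tokens: int = WENXIN_EMBEDDING_MAX_TOKENS) -> str:
    """Keep at most `max_tokens` tokens of `text`.

    ASSUMPTION: tokens are approximated by whitespace-separated words.
    A real implementation would count tokens with the provider's tokenizer.
    """
    tokens = text.split()
    return " ".join(tokens[:max_tokens])


long_input = " ".join(["word"] * 500)
truncated = truncate_for_embedding(long_input)
```

Because the sketch rejoins words with single spaces, it also normalizes whitespace in the input, which a token-level implementation would not necessarily do.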
## Running the API
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# edit .env to set OPENAI_API_KEY etc.

# run the API
./.venv/bin/uvicorn llm_fusion_api:app --reload
```

## Deploy
Build the Docker image:
```bash
docker build -t ninehills/llm-fusion-api:latest .
```

### Test the API
```txt
$ curl localhost:8000/v1/models 2>/dev/null | jq ".data[].id"
"gpt-3.5-turbo-16k-0613"
"gpt-3.5-turbo-0301"
"gpt-3.5-turbo-16k"
"gpt-4-0613"
"gpt-4-0314"
"text-embedding-ada-002"
"gpt-4"
"gpt-3.5-turbo-0613"
"gpt-3.5-turbo"
"wenxin/ernie-bot"
"wenxin/ernie-bot-turbo"
"wenxin/bloomz_7b1"
"wenxin/embedding-v1"
"minimax/abab5.5-chat"
"minimax/embo-01"

$ curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer xxx" \
-d '{ "stream": true,
"model": "minimax/abab5.5-chat",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}'
data: {"id": "abe9437a622c413abd157605efb6e228", "object": "chat.completion.chunk", "created": 1690690250, "model": "abab5.5-chat", "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": null}]}
data: {"id": "abe9437a622c413abd157605efb6e228", "object": "chat.completion.chunk", "created": 1690690250, "model": "abab5.5-chat", "choices": [{"index": 0, "delta": {"content": "Hello! How can I assist you today?"}, "finish_reason": null}]}
data: {"id": "abe9437a622c413abd157605efb6e228", "object": "chat.completion.chunk", "created": 1690690250, "model": "abab5.5-chat", "choices": [{"index": 0, "delta": {"content": ""}, "finish_reason": "stop"}]}
data: [DONE]
$ curl http://localhost:8000/v1/embeddings \
-H "Authorization: Bearer xxxx" \
-H "Content-Type: application/json" \
-d '{
"input": "The food was delicious and the waiter...",
"model": "text-embedding-ada-002"
}'

$ curl http://localhost:8000/v1/engines/text-embedding-ada-002/embeddings \
-H "Authorization: Bearer xxxx" \
-H "Content-Type: application/json" \
-d '{
"input": "The food was delicious and the waiter..."
}'
```
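The streamed chat completion in the transcript above arrives as `data:` lines ending with `data: [DONE]`. A client might reassemble the reply along the lines of this sketch. The helper name `iter_stream_content` is an assumption, and the sample lines are trimmed copies of the chunks in the transcript (only the `choices` field is kept).

```python
import json
from typing import Iterable, Iterator


def iter_stream_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:  # ignore empty deltas such as the role-only first chunk
                yield fragment


# Trimmed copies of the chunks from the transcript.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": null}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello! How can I assist you today?"}, "finish_reason": null}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": ""}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]

reply = "".join(iter_stream_content(sample_stream))
```

In a real client the `lines` argument would come from the HTTP response body read line by line; that part is omitted here so the sketch runs without a server.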