https://github.com/lroolle/one-llm-api
Cloudflare worker.js that proxy Claude/AzureOpenAI to OpenAI ChatCompletions API.
https://github.com/lroolle/one-llm-api
Last synced: 9 days ago
JSON representation
Cloudflare worker.js that proxy Claude/AzureOpenAI to OpenAI ChatCompletions API.
- Host: GitHub
- URL: https://github.com/lroolle/one-llm-api
- Owner: lroolle
- Created: 2023-07-13T00:02:29.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-09-15T15:27:15.000Z (about 2 years ago)
- Last Synced: 2025-01-15T05:59:14.453Z (9 months ago)
- Language: TypeScript
- Size: 111 KB
- Stars: 8
- Watchers: 2
- Forks: 3
- Open Issues: 0
-
Metadata Files:
- Readme: README.org
Awesome Lists containing this project
README
* ONE-LLM-API
This is a Cloudflare worker that acts as an adapter to proxy requests to the
[[https://platform.openai.com/docs/guides/gpt/chat-completions-api][OpenAI Chat Completions API]] to other AI services like Claude, Azure OpenAI, and
Google Palm.This project draws inspiration from, and is based on, the work of
[[https://github.com/haibbo/cf-openai-azure-proxy][haibbo/cf-openai-azure-proxy]]. If you only require a single service proxy
(such as Claude/Azure), please consider using the original project.
This project, however, is more complex and is specifically designed to
accommodate multiple services.** Quick Start
#+begin_src sh :exports both :wrap src sh :results raw replace
curl TODO
#+end_src** Overview
The worker allows you to make requests to the OpenAI API format and seamlessly
redirect them under the hood to other services. This provides a unified API
interface and abstraction layer across multiple AI providers.*** Key Features
- *Single endpoint to access multiple AI services*;
- Chat to *Multiple models* in a single request;
- Unified request/response format using the OpenAI API;
- Handles streaming responses;
- Single API key for authentication;
- Support for OpenAI, Claude, Azure OpenAI, and Google Palm;
- Support multiple resource config for Azure OpenAI;**** Features may be added in the future
1. Multiple "ONE_API_KEY" support;
2. Logging request by keys, pricing and token count;
3. Throttling;** Usages
To use the adapter, simply make requests to the worker endpoint with the OpenAI
JSON request payload.Behind the scenes the worker will:
- Route requests to the appropriate backend based on the `model` specified
- Transform request payload to the destination API format
- Proxy the request and response
- Convert responses back to OpenAI format*** Request Example
For example, to use =gpt-3.5-turbo=:
#+begin_src json :exports both
{
"model": "gpt-3.5-turbo",
"stream": true,
"messages": [
{
"role": "user",
"content": "Hello there!"
}
]
}
#+end_srcTo use =claude-2=:
#+begin_src json :exports both
{
"model": "claude-2",
"stream": true,
"messages": [...]
}
#+end_srcYou can specify *multiple models* (delimitered by ~,~) to query in parallel:
#+begin_src json :exports both
{
"model": "gpt-3.5-turbo,claude-2",
"stream": true,
"messages": [...]
}
#+end_srcThe response will contain the concatenated output from both models streamed in
the OpenAI API format.Other OpenAI parameters like `temperature`, `stream`, `stop` etc. can also be
specified normally.*** Python Example
#+begin_src python :exports both :results output
import openaiopenai.api_key = ""
openai.api_base = "/v1"# For example, the local wrangler development endpoint
# openai.api_key = 'sk-fakekey'
# openai.api_base = "http://127.0.0.1:8787/v1"chat_completion = openai.ChatCompletion.create(
model="gpt-4,claude-2",
messages=[
{
"role": "user",
"content": "A brief introduction about yourself and say hello!",
}
],
stream=True,
)for chunk in chat_completion:
if chunk["choices"]:
print(chunk["model"], chunk["choices"][0]["delta"].get("content", ""))
#+end_src** The API Services supported [2/4]
*** [X] OpenAI
CLOSED: [2023-07-18 Tue 21:08]
*** [X] Azure OpenAI
CLOSED: [2023-07-18 Tue 21:09]
*** [ ] Claude
CLOSED: [2023-07-18 Tue 21:09]
*** [ ] Google Palm
CLOSED: [2023-07-18 Tue 21:09]** The /models/ suported
Here are the models currently supported by the adapter service:
To use a particular model, specify its ID in the `model` field of the request body.
*** OpenAI Models
All the chat models available by your OPENAI_API_KEY
*** Azure OpenAI Models
Based on your deployment name, you will have to set the environment variable
~AZURE_OPENAI_API_KEY~ to the corresponding API key.You can also setup multiple deployments with different API keys to access
different models.// TODO:
*** TODO Claude Models
:LOGBOOK:
- State "TODO" from [2023-09-04 Mon 23:24]
:END:- claude-instant-1(claude-instant-1.2)
- claude-2(claude-2.0)*** TODO Google Palm Models
:LOGBOOK:
- State "TODO" from [2023-09-04 Mon 23:24]
:END:- text-bison-001
- chat-bison-001** Deployment
[[https://deploy.workers.cloudflare.com/?url=https://github.com/lroolle/one-llm-api][Deploy to Cloudflare Workers]]
To deploy, you will need:
- Cloudflare account
- API keys for each service*** Install wrangler
#+begin_src sh :exports both :wrap src sh :results raw replace
npm i wrangler -g
#+end_src*** KV create
#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler kv:namespace create ONELLM_KV# if you need to test in the local wrangler dev
wrangler kv:namespace create ONELLM_KV --preview
#+end_src*** Environment Variables
Configure the worker environment variables with your secret keys.
Skip the service key if you do not have one or you do not want to deploy it.
#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler secret put ONE_API_KEY
wrangler secret put OPENAI_API_KEY
wrangler secret put AZURE_OPENAI_API_KEYS
wrangler secret put ANTHROPIC_API_KEY
wrangler secret put PALM_API_KEY
#+end_srcOr you can add the keys after deploy using the Cloudflare dashboard.
#+begin_quote
Worker -> Settings -> Variables -> Environment Variables
#+end_quote*** Run publish/deploy
#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler depoly
#+end_src** Development
Create a ~.dev.vars~ with your environment API_KEYs, then run:
#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler dev
#+end_src#+begin_src sh :exports both :wrap src sh :results raw replace
curl -vvv http://127.0.0.1:8787/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer sk-fakekey" -d '{
"model": "gpt-3.5-turbo,claude-2", "stream": true,
"messages": [{"role": "user", "content": "Say: Hello I am your helpful one Assistant."}]
}'
#+end_src** Contributions
Contributions and improvements are welcome! Please open GitHub issues or PRs.
Let me know if you would like any changes or have additional sections to add!