https://github.com/lroolle/one-llm-api

Cloudflare worker.js that proxy Claude/AzureOpenAI to OpenAI ChatCompletions API.
https://github.com/lroolle/one-llm-api

Last synced: 9 days ago
JSON representation

Cloudflare worker.js that proxy Claude/AzureOpenAI to OpenAI ChatCompletions API.

Host: GitHub
URL: https://github.com/lroolle/one-llm-api
Owner: lroolle
Created: 2023-07-13T00:02:29.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-09-15T15:27:15.000Z (about 2 years ago)
Last Synced: 2025-01-15T05:59:14.453Z (9 months ago)
Language: TypeScript
Size: 111 KB
Stars: 8
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.org

Awesome Lists containing this project

README

* ONE-LLM-API

This is a Cloudflare worker that acts as an adapter to proxy requests to the
[[https://platform.openai.com/docs/guides/gpt/chat-completions-api][OpenAI Chat Completions API]] to other AI services like Claude, Azure OpenAI, and
Google Palm.

This project draws inspiration from, and is based on, the work of
[[https://github.com/haibbo/cf-openai-azure-proxy][haibbo/cf-openai-azure-proxy]]. If you only require a single service proxy
(such as Claude/Azure), please consider using the original project.
This project, however, is more complex and is specifically designed to
accommodate multiple services.

** Quick Start

#+begin_src sh :exports both :wrap src sh :results raw replace
curl TODO
#+end_src

** Overview

The worker allows you to make requests to the OpenAI API format and seamlessly
redirect them under the hood to other services. This provides a unified API
interface and abstraction layer across multiple AI providers.

*** Key Features

- *Single endpoint to access multiple AI services*;
- Chat to *Multiple models* in a single request;
- Unified request/response format using the OpenAI API;
- Handles streaming responses;
- Single API key for authentication;
- Support for OpenAI, Claude, Azure OpenAI, and Google Palm;
- Support multiple resource config for Azure OpenAI;

**** Features may be added in the future
1. Multiple "ONE_API_KEY" support;
2. Logging request by keys, pricing and token count;
3. Throttling;

** Usages

To use the adapter, simply make requests to the worker endpoint with the OpenAI
JSON request payload.

Behind the scenes the worker will:

- Route requests to the appropriate backend based on the `model` specified
- Transform request payload to the destination API format
- Proxy the request and response
- Convert responses back to OpenAI format

*** Request Example

For example, to use =gpt-3.5-turbo=:

#+begin_src json :exports both
{
"model": "gpt-3.5-turbo",
"stream": true,
"messages": [
{
"role": "user",
"content": "Hello there!"
}
]
}
#+end_src

To use =claude-2=:

#+begin_src json :exports both
{
"model": "claude-2",
"stream": true,
"messages": [...]
}
#+end_src

You can specify *multiple models* (delimitered by ~,~) to query in parallel:

#+begin_src json :exports both
{
"model": "gpt-3.5-turbo,claude-2",
"stream": true,
"messages": [...]
}
#+end_src

The response will contain the concatenated output from both models streamed in
the OpenAI API format.

Other OpenAI parameters like `temperature`, `stream`, `stop` etc. can also be
specified normally.

*** Python Example

#+begin_src python :exports both :results output
import openai

openai.api_key = ""
openai.api_base = "/v1"

# For example, the local wrangler development endpoint
# openai.api_key = 'sk-fakekey'
# openai.api_base = "http://127.0.0.1:8787/v1"

chat_completion = openai.ChatCompletion.create(
model="gpt-4,claude-2",
messages=[
{
"role": "user",
"content": "A brief introduction about yourself and say hello!",
}
],
stream=True,
)

for chunk in chat_completion:
if chunk["choices"]:
print(chunk["model"], chunk["choices"][0]["delta"].get("content", ""))
#+end_src

** The API Services supported [2/4]

*** [X] OpenAI
CLOSED: [2023-07-18 Tue 21:08]
*** [X] Azure OpenAI
CLOSED: [2023-07-18 Tue 21:09]
*** [ ] Claude
CLOSED: [2023-07-18 Tue 21:09]
*** [ ] Google Palm
CLOSED: [2023-07-18 Tue 21:09]

** The /models/ suported

Here are the models currently supported by the adapter service:

To use a particular model, specify its ID in the `model` field of the request body.

*** OpenAI Models

All the chat models available by your OPENAI_API_KEY

*** Azure OpenAI Models

Based on your deployment name, you will have to set the environment variable
~AZURE_OPENAI_API_KEY~ to the corresponding API key.

You can also setup multiple deployments with different API keys to access
different models.

// TODO:

*** TODO Claude Models
:LOGBOOK:
- State "TODO" from [2023-09-04 Mon 23:24]
:END:

- claude-instant-1(claude-instant-1.2)
- claude-2(claude-2.0)

*** TODO Google Palm Models
:LOGBOOK:
- State "TODO" from [2023-09-04 Mon 23:24]
:END:

- text-bison-001
- chat-bison-001

** Deployment

[[https://deploy.workers.cloudflare.com/?url=https://github.com/lroolle/one-llm-api][Deploy to Cloudflare Workers]]

To deploy, you will need:

- Cloudflare account
- API keys for each service

*** Install wrangler

#+begin_src sh :exports both :wrap src sh :results raw replace
npm i wrangler -g
#+end_src

*** KV create
#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler kv:namespace create ONELLM_KV

# if you need to test in the local wrangler dev
wrangler kv:namespace create ONELLM_KV --preview
#+end_src

*** Environment Variables

Configure the worker environment variables with your secret keys.

Skip the service key if you do not have one or you do not want to deploy it.

#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler secret put ONE_API_KEY
wrangler secret put OPENAI_API_KEY
wrangler secret put AZURE_OPENAI_API_KEYS
wrangler secret put ANTHROPIC_API_KEY
wrangler secret put PALM_API_KEY
#+end_src

Or you can add the keys after deploy using the Cloudflare dashboard.

#+begin_quote
Worker -> Settings -> Variables -> Environment Variables
#+end_quote

*** Run publish/deploy

#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler depoly
#+end_src

** Development

Create a ~.dev.vars~ with your environment API_KEYs, then run:

#+begin_src sh :exports both :wrap src sh :results raw replace
wrangler dev
#+end_src

#+begin_src sh :exports both :wrap src sh :results raw replace
curl -vvv http://127.0.0.1:8787/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer sk-fakekey" -d '{
"model": "gpt-3.5-turbo,claude-2", "stream": true,
"messages": [{"role": "user", "content": "Say: Hello I am your helpful one Assistant."}]
}'
#+end_src

** Contributions

Contributions and improvements are welcome! Please open GitHub issues or PRs.

Let me know if you would like any changes or have additional sections to add!

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/lroolle/one-llm-api

Awesome Lists containing this project

README