Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/community-of-python/any-llm-client
Access LLMs via lightweight unified API
- Host: GitHub
- URL: https://github.com/community-of-python/any-llm-client
- Owner: community-of-python
- License: mit
- Created: 2024-11-20T10:36:50.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-12-05T15:38:51.000Z (about 1 month ago)
- Last Synced: 2024-12-05T16:32:28.904Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 90.8 KB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# any-llm-client
A unified and lightweight asynchronous Python API for communicating with LLMs.
It supports multiple providers, including the OpenAI Chat Completions API (and any OpenAI-compatible API, such as Ollama and vLLM) and the YandexGPT API.
## How To Use
Before you start using any-llm-client, make sure it is installed:
```sh
uv add any-llm-client
poetry add any-llm-client
```

### Response API
Here's a full example that uses Ollama and Qwen2.5-Coder:
```python
import asyncio

import any_llm_client

config = any_llm_client.OpenAIConfig(url="http://127.0.0.1:11434/v1/chat/completions", model_name="qwen2.5-coder:1.5b")


async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        print(await client.request_llm_message("Кек, чо как вообще на нарах?"))


asyncio.run(main())
```

To use `YandexGPT`, replace the config:
```python
config = any_llm_client.YandexGPTConfig(
    auth_header=os.environ["YANDEX_AUTH_HEADER"], folder_id=os.environ["YANDEX_FOLDER_ID"], model_name="yandexgpt"
)
```

### Streaming API
LLMs often take a long time to respond fully. Here's an example of streaming API usage:
```python
import asyncio

import any_llm_client

config = any_llm_client.OpenAIConfig(url="http://127.0.0.1:11434/v1/chat/completions", model_name="qwen2.5-coder:1.5b")


async def main() -> None:
    async with (
        any_llm_client.get_client(config) as client,
        client.stream_llm_message_chunks("Кек, чо как вообще на нарах?") as message_chunks,
    ):
        async for chunk in message_chunks:
            print(chunk, end="", flush=True)


asyncio.run(main())
```

### Passing chat history and temperature
You can pass a list of messages instead of a `str` as the first argument, and set `temperature`:
```python
async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("Ты — опытный ассистент"),
            any_llm_client.UserMessage("Кек, чо как вообще на нарах?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    ...
```

### Other
#### Mock client
You can use a mock client for testing:
```python
config = any_llm_client.MockLLMConfig(
    response_message=...,
    stream_messages=["Hi!"],
)

async with any_llm_client.get_client(config, ...) as client:
    ...
```

#### Configuration with environment variables
##### Credentials
Instead of passing credentials directly, you can set the corresponding environment variables:
- OpenAI: `ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN`,
- YandexGPT: `ANY_LLM_CLIENT_YANDEXGPT_AUTH_HEADER`, `ANY_LLM_CLIENT_YANDEXGPT_FOLDER_ID`.

##### LLM model config (with [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/))
```python
import os

import pydantic_settings

import any_llm_client


class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig


os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b"
}"""
settings = Settings()

async with any_llm_client.get_client(settings.llm_model, ...) as client:
    ...
```

Combined with the environment variables from the previous section, this lets you keep the LLM model configuration and secrets separate.
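As a minimal sketch of that split (the token value, model, and prompt below are placeholders; per the credentials section above, the auth token is assumed to be read from `ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN` instead of being passed in the config):
```python
import asyncio
import os

import pydantic_settings

import any_llm_client


class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig


# Model configuration and credentials live in separate environment variables,
# typically set by the deployment environment rather than in code.
os.environ["LLM_MODEL"] = (
    '{"api_type": "openai", "url": "https://api.openai.com/v1/chat/completions", "model_name": "gpt-4o-mini"}'
)
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "my-secret-token"  # placeholder


async def main() -> None:
    async with any_llm_client.get_client(Settings().llm_model) as client:
        print(await client.request_llm_message("Hello!"))


asyncio.run(main())
```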
#### Using clients directly
The recommended way to get an LLM client is to call `any_llm_client.get_client()`; this makes it easy to swap LLM models. If you prefer, you can use `any_llm_client.OpenAIClient` or `any_llm_client.YandexGPTClient` directly:
```python
config = any_llm_client.OpenAIConfig(
    url=pydantic.HttpUrl("https://api.openai.com/v1/chat/completions"),
    auth_token=os.environ["OPENAI_API_KEY"],
    model_name="gpt-4o-mini",
)

async with any_llm_client.OpenAIClient(config, ...) as client:
    ...
```

#### Errors
`any_llm_client.LLMClient.request_llm_message()` and `any_llm_client.LLMClient.stream_llm_message_chunks()` will raise `any_llm_client.LLMError` or `any_llm_client.OutOfTokensOrSymbolsError` when the LLM API responds with a failed HTTP status.
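As a minimal sketch of handling these errors (the helper name and the shorten-and-retry fallback are purely illustrative):
```python
import any_llm_client


async def request_or_shorten(client: any_llm_client.LLMClient, prompt: str) -> str:
    try:
        return await client.request_llm_message(prompt)
    except any_llm_client.OutOfTokensOrSymbolsError:
        # The request exceeded the model's token/symbol limits; retry with a shorter prompt.
        return await client.request_llm_message(prompt[: len(prompt) // 2])
    except any_llm_client.LLMError:
        # Any other failed HTTP status from the LLM API.
        raise
```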
#### Timeouts, proxy & other HTTP settings
Pass custom [HTTPX](https://www.python-httpx.org) kwargs to `any_llm_client.get_client()`:
```python
import httpx

import any_llm_client

async with any_llm_client.get_client(
    ...,
    mounts={"https://api.openai.com": httpx.AsyncHTTPTransport(proxy="http://localhost:8030")},
    timeout=httpx.Timeout(None, connect=5.0),
) as client:
    ...
```

The default timeout is `httpx.Timeout(None, connect=5.0)` (5 seconds on connect, unlimited on read, write or pool).
#### Retries
By default, requests are retried 3 times on HTTP status errors. You can change the retry behaviour by supplying the `request_retry` parameter:
```python
async with any_llm_client.get_client(..., request_retry=any_llm_client.RequestRetryConfig(attempts=5, ...)) as client:
    ...
```

#### Passing extra data to LLM
```python
await client.request_llm_message("Кек, чо как вообще на нарах?", extra={"best_of": 3})
```