{"id":50449476,"url":"https://github.com/aktagon/llmkit-python","last_synced_at":"2026-05-31T23:32:05.153Z","repository":{"id":356098221,"uuid":"1230293141","full_name":"aktagon/llmkit-python","owner":"aktagon","description":"Unified LLM client library for Python - one API, 27 providers (Anthropic, OpenAI, Google Gemini, AWS Bedrock, Mistral, Groq, DeepSeek, +20 more), zero external dependencies (stdlib only).","archived":false,"fork":false,"pushed_at":"2026-05-14T07:29:10.000Z","size":225,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-14T09:40:03.971Z","etag":null,"topics":["agents","ai","ai-sdk","anthropic","bedrock","claude","gemini","gpt","groq","llm","llm-client","mistral","openai","python","streaming","tool-calling"],"latest_commit_sha":null,"homepage":"https://llmkit.aktagon.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aktagon.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-05T21:28:11.000Z","updated_at":"2026-05-14T07:29:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/aktagon/llmkit-python","commit_stats":null,"previous_names":["aktagon/llmkit-python"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aktagon/llmkit-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aktagon%2Fllmkit-python"
,"tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aktagon%2Fllmkit-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aktagon%2Fllmkit-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aktagon%2Fllmkit-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aktagon","download_url":"https://codeload.github.com/aktagon/llmkit-python/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aktagon%2Fllmkit-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33753923,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","ai","ai-sdk","anthropic","bedrock","claude","gemini","gpt","groq","llm","llm-client","mistral","openai","python","streaming","tool-calling"],"created_at":"2026-05-31T23:32:04.123Z","updated_at":"2026-05-31T23:32:05.147Z","avatar_url":"https://github.com/aktagon.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# llmkit (Python)\n\nOne Python API for Anthropic, OpenAI, Google, and 20+ other providers — including local models through Ollama and vLLM. 
Switch providers without rewriting your request.\n\nAsync. Zero external dependencies — stdlib only, no `httpx`, no `pydantic`. Python 3.10+.\n\nAlso available for Go, TypeScript, and Rust.\n\n## Install\n\n```bash\npip install llmkit\n# or with uv:\nuv add llmkit\n```\n\nPython 3.10 or later.\n\n## Quick Start\n\n```python\nimport os\nimport asyncio\nfrom llmkit.builders import anthropic\n\nasync def main():\n    c = anthropic(os.environ[\"ANTHROPIC_API_KEY\"])\n    resp = await (\n        c.text\n        .system(\"Be concise.\")\n        .temperature(0.3)\n        .prompt(\"Say hi\")\n    )\n    print(resp.text)\n    print(resp.tokens.input, \"input tokens\")\n\nasyncio.run(main())\n```\n\nThe typed builder is the only public surface as of v1.0.0. One mental model — `client.\u003ccapability\u003e.\u003cchain\u003e.\u003cterminal\u003e` — across every capability.\n\nRunnable counterparts to every code block below live in [`examples/`](./examples/) and are exercised by `tests/test_examples.py` against a mock HTTP server, so the call shapes shown here are guaranteed to execute against the real builder surface.\n\n## Providers\n\nPer-provider factory functions:\n\n```\nai21       anthropic  azure      bedrock    cerebras   cohere\ndeepseek   doubao     ernie      fireworks  google     grok\ngroq       jan        llamacpp   lmstudio   minimax    mistral\nmoonshot   ollama     openai     openrouter perplexity qwen\nsambanova  together   vertex     vllm       yi         zhipu\n```\n\nOr use the generic `new_client(name, api_key)`. 30 providers, 4 API shapes (OpenAI-compatible, Anthropic Messages, Google Generative AI, AWS Bedrock Converse). 
Bedrock auth uses SigV4; other providers use API-key auth.\n\n## API\n\n### Text — one-shot prompt\n\n```python\nresp = await (\n    c.text\n    .system(\"You are helpful\")\n    .temperature(0.7)\n    .max_tokens(200)\n    .prompt(\"What is 2+2?\")\n)\n\nprint(resp.text)               # \"4\"\nprint(resp.tokens.input)       # prompt tokens\nprint(resp.tokens.output)      # completion tokens\nprint(resp.tokens.cache_read)  # tokens served from cache\nprint(resp.tokens.cache_write) # tokens written to cache (Anthropic explicit)\nprint(resp.tokens.reasoning)   # internal reasoning tokens (OpenAI o-series, Gemini 2.5+)\n```\n\nCapability-scoped fields (`cache_read`, `cache_write`, `reasoning`) are zero when the provider doesn't report them separately.\n\n### Stream — async iteration with trailing handle\n\n```python\nstream = c.text.system(\"Be brief\").stream(\"Tell me a joke\")\nasync for chunk in stream:\n    print(chunk, end=\"\", flush=True)\nprint(\"\\nUsage:\", stream.response.tokens)\n```\n\n`TextStream` implements `__aiter__`. After iteration completes, the `stream.response` property carries the final `Response` (with token counts) and `stream.error` carries any terminal error. Handles both Anthropic-style typed events and OpenAI-style data-only frames internally.\n\n### Agent — tool loop\n\n```python\nfrom llmkit import Tool\n\ndef add(args):\n    return str(args[\"a\"] + args[\"b\"])\n\nadd_tool = Tool(\n    name=\"add\",\n    description=\"Add two numbers\",\n    schema={\n        \"type\": \"object\",\n        \"properties\": {\"a\": {\"type\": \"number\"}, \"b\": {\"type\": \"number\"}},\n    },\n    run=add,\n)\n\nbot = (\n    c.agent\n    .system(\"You are a calculator.\")\n    .add_tool(add_tool)\n    .max_tool_iterations(5)\n)\nresp = await bot.prompt(\"What is 2+3?\")\nprint(resp.text)\n```\n\n`*Agent` is **stateful** — repeated `bot.prompt(...)` calls accumulate history. 
Chain methods (`.system(...)`, `.add_tool(...)`) clone and reset state, so a forked builder gets a fresh conversation. `bot.reset()` clears state without dropping chained config.\n\nTool dispatch covers Anthropic `tool_use`, OpenAI `tool_calls`, Google `functionCall`, and Bedrock Converse `toolUse`.\n\n### Image — text-to-image and edit\n\nSupports Google's Nano Banana 2 (`gemini-3.1-flash-image-preview`) and Pro (`gemini-3-pro-image-preview`); OpenAI's `gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`; xAI's `grok-imagine-image-quality`; Google Cloud Vertex AI's Imagen 3 / Imagen 4 (`imagen-3.0-generate-002`, `imagen-3.0-fast-generate-001`, `imagen-4.0-generate-preview-06-06`).\n\n```python\nfrom llmkit.builders import google\n\nc = google(os.environ[\"GOOGLE_API_KEY\"])\nimg = await (\n    c.image\n    .model(\"gemini-3.1-flash-image-preview\")\n    .aspect_ratio(\"16:9\")\n    .image_size(\"2K\")\n    .generate(\"A nano banana dish, studio lighting\")\n)\n\nwith open(\"out.png\", \"wb\") as f:\n    f.write(img.images[0].bytes)\n```\n\nFor compositional editing, chain `.text(...)` and `.image(mime, bytes)` to interleave references with descriptions:\n\n```python\nawait (\n    c.image\n    .model(\"gemini-3.1-flash-image-preview\")\n    .text(\"Person:\")\n    .image(\"image/png\", person_bytes)\n    .text(\"Outfit:\")\n    .image(\"image/png\", outfit_bytes)\n    .generate(\"Generate the person wearing the outfit.\")\n)\n```\n\nAspect ratios and sizes validate against a per-model whitelist before the HTTP request. 
Empty whitelists mean \"no client-side check; pass through\" — providers like OpenAI accept arbitrary sizes within documented bounds (max edge ≤3840, both edges multiples of 16, ratio ≤3:1, total pixels 655K–8.3M), so the SDK trusts the API boundary instead of carrying a stale list.\n\nFor OpenAI, the chain dispatches automatically — no image parts hits `/v1/images/generations` (JSON), one or more image parts hits `/v1/images/edits` (multipart/form-data with one `image[]` field per reference, in caller order).\n\nProvider knobs are typed chain methods:\n\n| Method               | Provider support            | Wire field       |\n| -------------------- | --------------------------- | ---------------- |\n| `.quality(s)`        | OpenAI gpt-image-\\*         | `quality`        |\n| `.output_format(s)`  | OpenAI gpt-image-\\*         | `output_format`  |\n| `.background(s)`     | OpenAI gpt-image-\\*         | `background`     |\n| `.count(n)`          | OpenAI + xAI Grok           | `n`              |\n| `.mask(mime, bytes)` | OpenAI gpt-image-\\* (edits) | multipart `mask` |\n\nThe chain validates per provider — calling `.quality(...)` on a Google or xAI builder raises `ValidationError` immediately, no HTTP round-trip. 
Knobs without typed methods (OpenAI: `output_compression`, `moderation`) remain reachable via `.extra_fields(...)`, which is unvalidated and freeform.\n\n```python\nfrom llmkit.builders import openai\n\nc = openai(os.environ[\"OPENAI_API_KEY\"])\nresp = await (\n    c.image\n    .model(\"gpt-image-2\")\n    .image_size(\"1024x1024\")\n    .quality(\"high\")\n    .count(4)\n    .generate(\"A red circle on a white background\")\n)\n```\n\nOpenAI gpt-image-\\* models require organization verification — see [platform.openai.com/docs/guides/your-data#organization-verification](https://platform.openai.com/docs/guides/your-data#organization-verification).\n\nUp to 14 reference images per Google request, 16 per OpenAI request.\n\n#### Vertex AI Imagen (Google Cloud)\n\nVertex Imagen uses the `:predict` endpoint family and OAuth bearer auth instead of API keys. The SDK takes a bearer token (string); caller manages OAuth refresh externally (e.g. `gcloud auth print-access-token`, service-account JSON, or workload identity).\n\n```python\nimport os\nfrom llmkit.builders import vertex\n\n# Caller substitutes {project_id} and {location} before passing the URL.\nbase_url = (\n    \"https://us-central1-aiplatform.googleapis.com\"\n    \"/v1/projects/my-gcp-project/locations/us-central1/publishers/google/models\"\n)\n\nc = vertex(os.environ[\"VERTEX_BEARER_TOKEN\"]).with_base_url(base_url)\n\nresp = await (\n    c.image\n    .model(\"imagen-3.0-generate-002\")\n    .aspect_ratio(\"16:9\")\n    .count(2)\n    .generate(\"A red circle\")\n)\n```\n\nEdit-mode (single image into `instances[0].image`) and inpainting (`.mask(mime, bytes)` into `instances[0].mask.image`) work the same way. Imagen-specific knobs like `negativePrompt` and `safetySetting` are reachable through `.extra_fields(...)` — they spread into the request's `parameters` block. 
Vertex's `:predict` response does not carry token counts; `resp.tokens` stays zero.\n\n### Safety Settings\n\nControl content filtering for Gemini providers. `safety_settings` applies to text\ngeneration, streaming, agents, and Gemini image generation. `safety_filter` applies\nto Vertex Imagen only.\n\n```python\nfrom llmkit.builders import google, vertex\nfrom llmkit.types import (\n    SafetySetting,\n    HARM_CATEGORY_DANGEROUS_CONTENT,\n    HARM_CATEGORY_HARASSMENT,\n    HARM_BLOCK_THRESHOLD_NONE,\n    HARM_BLOCK_THRESHOLD_HIGH_ONLY,\n    IMAGE_SAFETY_FILTER_BLOCK_FEW,\n)\n\n# Gemini text or agent\nc = google(os.environ[\"GOOGLE_API_KEY\"])\nresp = await (\n    c.text\n    .safety_settings([\n        SafetySetting(category=HARM_CATEGORY_DANGEROUS_CONTENT, threshold=HARM_BLOCK_THRESHOLD_NONE),\n        SafetySetting(category=HARM_CATEGORY_HARASSMENT, threshold=HARM_BLOCK_THRESHOLD_HIGH_ONLY),\n    ])\n    .prompt(\"Write a story\")\n)\n\n# Vertex Imagen\nvc = vertex(os.environ[\"VERTEX_BEARER_TOKEN\"])\nimg = await (\n    vc.image\n    .model(\"imagen-3.0-generate-002\")\n    .safety_filter(IMAGE_SAFETY_FILTER_BLOCK_FEW)\n    .generate(\"A landscape\")\n)\n```\n\n`safety_settings` on Vertex Imagen and `safety_filter` on non-Imagen providers raise\na `ValidationError`. 
The `HARM_CATEGORY_*`, `HARM_BLOCK_THRESHOLD_*`, and\n`IMAGE_SAFETY_FILTER_*` constants cover all documented values; raw strings also work.\n\n### Upload — Path or Bytes\n\n```python\nfrom llmkit.builders import openai\n\nc = openai(os.environ[\"OPENAI_API_KEY\"])\n\n# from a path\nfile = await c.upload.path(\"./data.pdf\").run()\n\n# from bytes (filename required)\nfile2 = await (\n    c.upload\n    .bytes(buf)\n    .filename(\"report.pdf\")\n    .mime_type(\"application/pdf\")\n    .run()\n)\n```\n\n### Batches\n\n```python\nresults = await (\n    c.text\n    .system(\"Be brief\")\n    .batch([\"Translate hello to French\", \"Translate hello to Spanish\"])\n)\nfor r in results:\n    print(r.text)\n```\n\n`.batch(prompts)` is `.submit_batch(prompts)` + `handle.wait()`. Use `.submit_batch(prompts)` to get a `BatchHandle` you can persist, then call `await handle.wait()` later. Both inline (Anthropic) and file-reference (OpenAI two-hop) flows are handled internally.\n\n### Caching\n\n```python\n# Anthropic — explicit cache_control wrap of the system prompt:\nawait c.text.system(long_sys_prompt).caching().prompt(\"...\")\n\n# OpenAI — automatic server-side caching (caching() is a hint; reads\n# surface in resp.tokens.cache_read regardless):\nawait c.text.system(long_sys_prompt).caching().prompt(\"...\")\n\n# Google — pre-flight POST creates a cachedContents resource, then the\n# main call references it. Google requires ~1k+ tokens of system prompt:\nawait c.text.system(big_sys_prompt).caching().prompt(\"...\")\n```\n\nThe mode is provider-specific and inferred from the provider config. The default TTL for Google is 3600s.\n\n### Model catalogue\n\n`c.models` and `c.providers` cover model discovery in three modes. Runnable counterpart at [`examples/catalogue.py`](./examples/catalogue.py).\n\n```python\nfrom llmkit import Provider\nfrom llmkit.types import Capability\n\n# 1. 
Compiled-in catalogue — synchronous, no HTTP.\nall_models = c.models.list()\ninfo = c.models.get(\"claude-opus-4-7\")            # ModelInfo | None\nchat = c.models.with_capability(Capability.CHAT_COMPLETION).list()\n\n# 2. Providers namespace.\nc.providers.list()        # configured (credentials + /v1/models endpoint)\nc.providers.supported()   # every provider the SDK was built with\n\n# 3. Live + scoped HTTP.\nlive = await c.models.live()                       # LiveResult — fan-out\np = Provider(name=\"anthropic\", api_key=\"sk-...\")\nscoped = await c.models.provider(p).list()         # single-provider list\nraw = await c.models.provider(p).raw().list()      # ModelInfo.raw populated\n```\n\n`live()` calls every configured provider's `/v1/models` in parallel and aggregates results into `LiveResult.models` + a per-provider `LiveResult.errors` map (partial success is the normal case). `provider(p).raw().list()` opts into populating `ModelInfo.raw` with the provider-native record — useful when you need fields the universal `ModelInfo` does not carry (Anthropic's capability matrix, Google's `supportedGenerationMethods`, etc.).\n\n## Options\n\nAcross every `*Text` / `*Agent` builder:\n\n| Concept           | Method                 |\n| ----------------- | ---------------------- |\n| System prompt     | `.system(s)`           |\n| Model override    | `.model(name)`         |\n| Sampling          | `.temperature(t)`      |\n| Token cap         | `.max_tokens(n)`       |\n| Caching           | `.caching()`           |\n| Structured output | `.schema(json)`        |\n| Middleware hooks  | `.add_middleware(fns)` |\n| Reasoning effort  | `.reasoning_effort(l)` |\n| Thinking budget   | `.thinking_budget(n)`  |\n\n`*Text` additionally exposes `.history(*msgs)` for stateless multi-turn replay. `*Agent` is stateful instead — history accumulates across `.prompt(...)` calls on the same builder instance and resets when a chain method clones the builder or `.reset()` is called. 
Cross-process resume of an `*Agent` is not supported via a builder method today.\n\nSampling hyperparameters (`.top_p`, `.top_k`, `.seed`, `.frequency_penalty`, `.presence_penalty`, `.stop_sequences`) are validated per provider; unsupported options raise `ValidationError` rather than being silently dropped.\n\nThe Image builder has a narrower set: `.model`, `.aspect_ratio`, `.image_size`, `.include_text`, `.text`, `.image`, `.middleware`. Upload: `.path`, `.bytes`, `.filename`, `.mime_type`, `.middleware`.\n\n## Middleware\n\n```python\nfrom llmkit import Event, MiddlewareFn\n\ndef log_usage(e):\n    if e.op == \"llm_request\" and e.phase == \"post\":\n        print(f\"{e.provider}/{e.model}: {e.usage.input} in, {e.usage.output} out\")\n    return None\n\nawait c.text.add_middleware([log_usage]).prompt(\"...\")\n```\n\nPre-phase middleware can veto by returning a non-None error message; post-phase runs for observation only. Wired at seven sites: text prompt, text stream, agent LLM call, agent tool execution, upload, batch submit, Google resource caching pre-flight.\n\n## Self-hosted endpoints\n\n```python\nfrom llmkit.builders import openai\n\nc = openai(\"anything\").with_base_url(\"http://localhost:8080/v1\")\n```\n\nWorks for any OpenAI-compatible server (vLLM, LM Studio, Ollama, corporate gateways).\n\n## Wire-format stability\n\n`*Agent` history persists across process boundaries through two paired\nfunctions:\n\n```python\ndata = bot.save()                             # bytes\n# ...later, fresh process...\nbot = c.agent.system(\"...\").tool(t).load(data)\n# raises UnsupportedWireVersionError on mismatch\n```\n\nOr the free-function form for admin tooling:\n\n```python\nfrom llmkit import save_history, load_history\n\ndata = save_history(msgs)\nmsgs = load_history(data)\n```\n\nThe output is a JSON document with a `_v` integer envelope plus a\n`messages` array. 
The version is tracked through\n`WIRE_SCHEMA_VERSION`; the in-memory `Message` schema may evolve\nadditively under one version (new optional fields work on older\nreaders), but a renamed, removed, or retyped field requires a `_v`\nbump and a migrator.\n\n`save_history` / `load_history` are the ONLY guaranteed-stable\nserialization path. Direct `json.dumps` / `dataclasses.asdict` on a\n`Message` produces valid JSON but lacks the `_v` envelope, and\n`load_history` rejects it with `MissingWireVersionError`. Use the\ncontract path for anything that crosses a process boundary or a\nrelease.\n\n## Mirror\n\nThis repo is a read-only mirror of a private monorepo. File issues here; code patches should target the private source via `christian@aktagon.com`.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faktagon%2Fllmkit-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faktagon%2Fllmkit-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faktagon%2Fllmkit-python/lists"}