{"id":28530711,"url":"https://github.com/sethbang/venice-ai","last_synced_at":"2025-07-07T09:31:34.611Z","repository":{"id":296889879,"uuid":"994865665","full_name":"sethbang/venice-ai","owner":"sethbang","description":"🐍 Python client for Venice.ai. Seamlessly integrate powerful GenAI: chat 💬, image gen 🖼️, audio TTS 🔊, embeddings \u0026 more. Supports sync/async, streaming \u0026 robust error handling.","archived":false,"fork":false,"pushed_at":"2025-06-19T21:20:48.000Z","size":912,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-19T21:53:00.677Z","etag":null,"topics":["ai","api-client","asynchronous","client-libray","developer-tools","generative-ai","image-generation","llm","python","python-library","sdk","text-generation","tts","uncensored","venice-ai"],"latest_commit_sha":null,"homepage":"https://venice-ai.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sethbang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-02T15:46:29.000Z","updated_at":"2025-06-19T21:20:12.000Z","dependencies_parsed_at":"2025-06-08T01:01:51.038Z","dependency_job_id":null,"html_url":"https://github.com/sethbang/venice-ai","commit_stats":null,"previous_names":["sethbang/venice-ai"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/sethbang/venice-ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sethbang%2Fvenice-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/sethbang%2Fvenice-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sethbang%2Fvenice-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sethbang%2Fvenice-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sethbang","download_url":"https://codeload.github.com/sethbang/venice-ai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sethbang%2Fvenice-ai/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261348769,"owners_count":23145318,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","api-client","asynchronous","client-libray","developer-tools","generative-ai","image-generation","llm","python","python-library","sdk","text-generation","tts","uncensored","venice-ai"],"created_at":"2025-06-09T14:37:22.718Z","updated_at":"2025-07-07T09:31:34.597Z","avatar_url":"https://github.com/sethbang.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Venice AI Python Client\n\n[![PyPI version](https://img.shields.io/pypi/v/venice-ai.svg)](https://pypi.org/project/venice-ai/)\n[![CI Status](https://github.com/sethbang/venice-ai/actions/workflows/python-publish.yaml/badge.svg)](https://github.com/sethbang/venice-ai/actions/workflows/python-publish.yaml)\n[![Coverage Status](https://img.shields.io/codecov/c/github/sethbang/venice-ai.svg)](https://codecov.io/gh/sethbang/venice-ai)\n[![License: 
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python Versions](https://img.shields.io/pypi/pyversions/venice-ai.svg)](https://pypi.org/project/venice-ai/)\n\nDeveloped to benchmark and explore the full capabilities of the Venice.ai API, the venice-ai Python package has evolved into a comprehensive client library for developers. This library provides convenient access to Venice.ai's powerful features, including chat completions, image generation, audio synthesis, embeddings, model management, API key management, billing information, and more, with support for both synchronous and asynchronous operations.\n\n## Powered by\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"https://venice.ai/chat?ref=6sxLV1\"\u003e\n    \u003cimg src=\"./venice-logo-lockup-red.svg\" alt=\"Venice.ai\" width=\"250\"\u003e\n  \u003c/a\u003e\n  \u003csub\u003e\u003cem\u003e*This is a referral link\u003c/em\u003e\u003c/sub\u003e\n\u003c/div\u003e\n\n## Table of Contents\n\n- [Features](#features)\n- [Getting Started](#getting-started)\n  - [Prerequisites](#prerequisites)\n  - [Installation](#installation)\n  - [API Key Setup](#api-key-setup)\n- [Quick Start](#quick-start)\n- [Usage](#usage)\n  - [Client Initialization](#client-initialization)\n  - [Chat Completions](#chat-completions)\n  - [Image Generation](#image-generation)\n  - [Audio Synthesis (Text-to-Speech)](#audio-synthesis-text-to-speech)\n  - [Embeddings Creation](#embeddings-creation)\n  - [Model Management](#model-management)\n  - [API Key Management](#api-key-management)\n  - [Billing Information](#billing-information)\n- [Error Handling](#error-handling)\n- [Advanced Usage](#advanced-usage)\n  - [Advanced HTTP Client Configuration](#advanced-http-client-configuration)\n  - [Understanding Streaming](#understanding-streaming)\n  - [Token Estimation](#token-estimation)\n- [Showcase Application](#showcase-application)\n- [Testing](#testing)\n- 
[Documentation](#documentation)\n- [Contributing](#contributing)\n- [License](#license)\n\n## Features\n\n- Intuitive Pythonic interface for all Venice.ai API endpoints.\n- Support for both synchronous and asynchronous operations.\n- Access to all major Venice AI model families including text generation, image synthesis, audio processing, and embedding models.\n- Streaming capabilities for chat completions and audio.\n- Built-in utilities for tasks like token estimation and chat message validation.\n- Robust error handling with a custom exception hierarchy.\n- Type-hinted for a better developer experience and static analysis.\n- Resource-oriented client design (e.g., `client.chat`, `client.image`).\n- Automatic request retries for transient errors and configurable HTTP status codes.\n- Comprehensive testing suite with `test_runner.py` for easy execution.\n- Detailed API documentation generated with Sphinx.\n- Support for `logprobs` and `top_logprobs` in chat completions to retrieve token likelihoods.\n\n## Getting Started\n\n### Prerequisites\n\nPython 3.11 or higher.\n\n### Installation\n\nYou can install the Venice AI client library from PyPI:\n\n```bash\npip install venice-ai\n```\n\nTo include optional dependencies for token estimation:\n\n```bash\npip install venice-ai[tokenizers]\n```\n\nAlternatively, to install the latest development version from source (recommended if you want to contribute or need the absolute latest changes not yet released on PyPI):\n\n```bash\ngit clone https://github.com/sethbang/venice-ai.git # Or your fork\ncd venice-ai\npoetry install\n```\n\nNote: `poetry install` installs main dependencies. For development or running all tests, install with dev dependencies: `poetry install --with dev`. To include optional tokenizers support: `poetry install --extras \"tokenizers\"`.\n\n### API Key Setup\n\nTo use the Venice AI API, you need an API key. 
You can obtain your API key from your Venice AI dashboard.\n\nThe client library expects the API key to be available as an environment variable (recommended):\n\n```bash\nexport VENICE_API_KEY=\"your_api_key_here\"\n```\n\nAlternatively, you can pass the API key directly when initializing the client:\n`client = VeniceClient(api_key=\"your_api_key_here\")`\n\n## Quick Start\n\nGet up and running in seconds!\n\n**Synchronous Chat Completion:**\n\n```python\nfrom venice_ai import VeniceClient\n\n# Assumes VENICE_API_KEY is set in your environment\n# Recommended: use as a context manager\nwith VeniceClient(default_timeout=60.0) as client:\n    try:\n        response = client.chat.completions.create(\n            model=\"llama-3.2-3b\", # Or your preferred model\n            messages=[{\"role\": \"user\", \"content\": \"Hello, Venice AI!\"}]\n        )\n        print(response.choices[0].message.content)\n    except Exception as e:\n        print(f\"An error occurred: {e}\")\n```\n\n**Asynchronous Chat Completion:**\n\n```python\nimport asyncio\nfrom venice_ai import AsyncVeniceClient\n\nasync def main():\n    # Assumes VENICE_API_KEY is set in your environment\n    # Recommended: use as an async context manager\n    async with AsyncVeniceClient(default_timeout=60.0) as async_client:\n        try:\n            response = await async_client.chat.completions.create(\n                model=\"llama-3.2-3b\", # Or your preferred model\n                messages=[{\"role\": \"user\", \"content\": \"Hello asynchronously, Venice AI!\"}]\n            )\n            print(response.choices[0].message.content)\n        except Exception as e:\n            print(f\"An error occurred: {e}\")\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n```\n\n## Usage\n\n### Client Initialization\n\n**Synchronous Client:**\n\n```python\nfrom venice_ai import VeniceClient\n\n# Option 1: API key from environment variable VENICE_API_KEY, with 
default timeout\nclient = VeniceClient(default_timeout=30.0)\n\n# Option 2: Pass API key directly and custom timeout\n# client = VeniceClient(api_key=\"your_api_key_here\", default_timeout=45.0)\n\n# Using as a context manager (recommended for proper resource cleanup):\nwith VeniceClient(api_key=\"your_api_key_here\", default_timeout=30.0) as client:\n    # Use the client for API calls\n    models_list = client.models.list()\n    print(f\"Found {len(models_list.data)} models.\")\n\n# If not using a context manager, remember to close the client:\n# client.close()\n```\n\n**Asynchronous Client:**\n\n```python\nimport asyncio\nfrom venice_ai import AsyncVeniceClient\n\nasync def main():\n    # Option 1: API key from environment variable VENICE_API_KEY, with default timeout\n    async_client = AsyncVeniceClient(default_timeout=30.0)\n\n    # Option 2: Pass API key directly and custom timeout\n    # async_client = AsyncVeniceClient(api_key=\"your_api_key_here\", default_timeout=45.0)\n\n    # Using as an async context manager (recommended):\n    async with AsyncVeniceClient(api_key=\"your_api_key_here\", default_timeout=30.0) as async_client:\n        # Use the client for API calls\n        models_list = await async_client.models.list()\n        print(f\"Found {len(models_list.data)} models.\")\n\n    # If not using an async context manager, remember to close the client:\n    # await async_client.close()\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n```\n\nIf you don't use a context manager, call `client.close()` (for `VeniceClient`) or `await async_client.close()` (for `AsyncVeniceClient`) when you're finished, so that the underlying HTTP connections and resources are released.\n\n### Chat Completions\n\nThe response objects (`response`, `chunk`) are Pydantic models. 
You can explore their structure for more details (see `src/venice_ai/types/chat.py` or the Sphinx-generated API documentation).\n\n**Model Context Windows:**\n\nDifferent models support different context window sizes. For example, the \"Venice Large\" model supports up to 128k tokens, allowing for extensive conversations or document processing. Use the `max_completion_tokens` parameter to control response length within the model's context limits.\n\n**Parameters:**\n\n- `logit_bias` (`Optional[Dict[str, int]]`): Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100.\n- `logprobs` (`Optional[bool]`): Whether to return log probabilities of the output tokens. If `True`, the `logprobs` field will be populated in the `choices` of the response. Defaults to `False`.\n- `parallel_tool_calls` (`Optional[bool]`): Whether to enable parallel function calling during tool use.\n- `top_logprobs` (`Optional[int]`): An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. 
Requires `logprobs` to be `True`.\n\n**Non-streaming example:**\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    response = client.chat.completions.create(\n        model=\"llama-3.2-3b\", # Or your preferred model\n        messages=[\n            {\"role\": \"user\", \"content\": \"Hello, how are you?\"}\n        ]\n    )\n    print(response.choices[0].message.content)\n```\n\n**Example with Venice Large (128k context window):**\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    # Venice Large supports up to 128k tokens, ideal for long documents or conversations\n    response = client.chat.completions.create(\n        model=\"venice-large\",\n        messages=[\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant that can analyze long documents.\"},\n            {\"role\": \"user\", \"content\": \"Please analyze this extensive document...\"}  # Can include very long content\n        ],\n        max_completion_tokens=4000  # Can use higher values with Venice Large's 128k context\n    )\n    print(response.choices[0].message.content)\n```\n\n**Example with `logprobs` and `top_logprobs`:**\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    response = client.chat.completions.create(\n        model=\"llama-3.2-3b\", # Or your preferred model\n        messages=[\n            {\"role\": \"user\", \"content\": \"What is the color of the sky?\"}\n        ],\n        logprobs=True,\n        top_logprobs=2 # Request the top 2 most likely tokens at each position\n    )\n\n    # Example of accessing logprobs data\n    if response.choices and response.choices[0].logprobs:\n        print(\"Logprobs received.\")\n        first_choice_logprobs = response.choices[0].logprobs\n        if first_choice_logprobs.content:\n            for i, token_logprob in enumerate(first_choice_logprobs.content[:2]): # Display 
for first 2 generated tokens\n                print(f\"Token {i+1}: '{token_logprob.token}' (logprob: {token_logprob.logprob:.4f})\")\n                if token_logprob.top_logprobs:\n                    print(f\"  Top {len(token_logprob.top_logprobs)} alternative tokens:\")\n                    for alt_token_logprob in token_logprob.top_logprobs:\n                        print(f\"    - '{alt_token_logprob.token}' (logprob: {alt_token_logprob.logprob:.4f})\")\n    else:\n        print(\"No logprobs data in response or choices.\")\n\n    print(f\"\\nMain response content: {response.choices[0].message.content}\")\n```\n\n**Streaming example:**\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    stream = client.chat.completions.create(\n        model=\"llama-3.2-3b\",\n        messages=[\n            {\"role\": \"user\", \"content\": \"Tell me a short story.\"}\n        ],\n        stream=True\n    )\n    for chunk in stream:\n        if chunk.choices and chunk.choices[0].delta.content:\n            print(chunk.choices[0].delta.content, end=\"\")\n    print()\n```\n\n**Tool Calling Example:**\n(Ensure the selected model supports tool calling)\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    response = client.chat.completions.create(\n        model=\"llama-3.2-3b\", # Choose a model that supports tool calls\n        messages=[{\"role\": \"user\", \"content\": \"What's the weather in London?\"}],\n        tools=[{\n            \"type\": \"function\",\n            \"function\": {\n                \"name\": \"get_current_weather\",\n                \"description\": \"Get the current weather in a given location\",\n                \"parameters\": {\n                    \"type\": \"object\",\n                    \"properties\": {\n                        \"location\": {\"type\": \"string\", \"description\": \"The city and state, e.g., San Francisco, CA\"},\n                        \"unit\": {\"type\": 
\"string\", \"enum\": [\"celsius\", \"fahrenheit\"]}\n                    },\n                    \"required\": [\"location\"]\n                }\n            }\n        }],\n        tool_choice=\"auto\" # Can be \"auto\", \"none\", or a specific tool\n    )\n    if response.choices[0].message.tool_calls:\n        tool_call = response.choices[0].message.tool_calls[0]\n        print(f\"Tool call requested: {tool_call.function.name}\")\n        print(f\"Arguments: {tool_call.function.arguments}\")\n    else:\n        print(response.choices[0].message.content)\n```\n\n### Image Generation\n\n**Basic Generation:**\n\n```python\nfrom venice_ai import VeniceClient\nimport base64\nfrom PIL import Image\nimport io\n\nwith VeniceClient() as client:\n    response = client.image.generate(\n        model=\"venice-sd35\", # Or your preferred image model\n        prompt=\"A futuristic cityscape at sunset\",\n        width=1024,\n        height=1024,\n        steps=25, # Example of another parameter\n        # negative_prompt=\"blurry, low quality\", # Example\n        # style_preset=\"cinematic\" # Example\n    )\n    if response.images:\n        img_b64 = response.images[0]\n        img_bytes = base64.b64decode(img_b64)\n        # pil_image = Image.open(io.BytesIO(img_bytes))\n        # pil_image.show()\n        # pil_image.save(\"generated_image.png\")\n        print(\"Image generated successfully (first image data received).\")\n```\n\n**Simple Generation (OpenAI-compatible):**\n\n```python\nwith VeniceClient() as client:\n    response = client.image.simple_generate(\n        model=\"venice-sd35\",\n        prompt=\"A cute cat wearing a small hat\",\n        size=\"512x512\"\n    )\n    # Process response.data[0].b64_json or response.data[0].url\n```\n\n**Image Upscaling:**\n\n```python\nwith VeniceClient() as client:\n    with open(\"path/to/your/image.png\", \"rb\") as img_file:\n        upscaled_image_bytes = client.image.upscale(\n            image=img_file, # Can be 
path, bytes, or file-like object\n            scale=2.0 # Example: 2x upscale\n        )\n    with open(\"upscaled_image.png\", \"wb\") as f:\n        f.write(upscaled_image_bytes)\n```\n\n**Listing Image Styles:**\n\n```python\nwith VeniceClient() as client:\n    styles_response = client.image.list_styles()\n    for style in styles_response.data:\n        print(f\"Available style: {style}\")\n```\n\n### Audio Synthesis (Text-to-Speech)\n\nGenerate speech from text.\n\n**Synchronous Example:**\n\n```python\nfrom venice_ai import VeniceClient\nfrom venice_ai.types.audio import Voice\n\nwith VeniceClient() as client:\n    audio_bytes = client.audio.create_speech(\n        model=\"tts-kokoro\", # Or your preferred TTS model\n        input=\"Hello from Venice AI audio!\",\n        voice=Voice.KOKORO_DEFAULT # Or other voice options like \"kokoro-style-energetic\"\n    )\n    with open(\"output.mp3\", \"wb\") as f:\n        f.write(audio_bytes)\n    print(\"Audio saved to output.mp3\")\n```\n\n**Streaming Example (Conceptual):**\n\n```python\n# with VeniceClient() as client: # or AsyncVeniceClient\n#     audio_stream = client.audio.create_speech(\n#         model=\"tts-kokoro\",\n#         input=\"This is a streamed audio example.\",\n#         voice=Voice.KOKORO_DEFAULT,\n#         stream=True\n#     )\n#     with open(\"streamed_output.mp3\", \"wb\") as f:\n#         for chunk in audio_stream:\n#             f.write(chunk)\n#     print(\"Streamed audio saved.\")\n```\n\n### Embeddings Creation\n\nThe embeddings endpoint now supports `base64` encoding and accepts a `user` parameter for improved OpenAI compatibility (though the `user` parameter is discarded by the Venice API).\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    response = client.embeddings.create(\n        model=\"text-embedding-bge-m3\", # Or your preferred embeddings model\n        input=\"The Venice AI Python client makes API interaction seamless.\",\n        # 
input=[\"Batch sentence 1\", \"Batch sentence 2\"] # For batching\n        encoding_format=\"float\", # Can be \"float\" or \"base64\"\n        user=\"user-123\" # Accepted for OpenAI compatibility, but has no effect\n    )\n    if response.data and response.data[0].embedding:\n        first_embedding_vector = response.data[0].embedding\n        print(f\"Generated embedding vector (first 5 dimensions): {first_embedding_vector[:5]}\")\n        print(f\"Total dimensions of the vector: {len(first_embedding_vector)}\")\n    else:\n        print(\"No embedding data received.\")\n```\n\n### Model Management\n\n```python\nfrom venice_ai import VeniceClient\n\nwith VeniceClient() as client:\n    # List all models\n    models_list = client.models.list()\n    print(f\"Total models available: {len(models_list.data)}\")\n    if models_list.data:\n        print(f\"First model: {models_list.data[0].id}\")\n\n    # List text models\n    text_models = client.models.list(type=\"text\")\n    print(f\"Text models: {[m.id for m in text_models.data]}\")\n```\n\n**Note:** Different models have varying capabilities and context window sizes. The SDK now exposes new model capabilities like `supportsVision`, `supportsReasoning`, and `quantization`. You can use the `get_filtered_models` utility to select models based on these capabilities. For example, \"Venice Large\" supports up to 128k tokens, making it ideal for processing extensive documents or maintaining long conversations. 
Refer to the official Venice AI documentation for detailed specifications of each model.\n\n### API Key Management\n\n```python\nfrom venice_ai import VeniceClient\n\n# Ensure your client is initialized with an ADMIN API key for these operations\n# with VeniceClient(api_key=\"YOUR_ADMIN_API_KEY\") as client:\n#     # List API Keys\n#     api_keys_list = client.api_keys.list()\n#     print(f\"Found {len(api_keys_list.data)} API keys.\")\n#\n#     # Create an API Key (handle the returned key securely - it's shown only once!)\n#     new_key_response = client.api_keys.create(\n#         description=\"My new inference key\",\n#         apiKeyType=\"INFERENCE\"\n#     )\n#     print(f\"Created new API key (last 6 characters): {new_key_response.data.last6Chars}\")\n#     print(f\"IMPORTANT: Full API Key: {new_key_response.data.apiKey} - Store it securely NOW!\")\n```\n\n**Warning:** API Key management operations typically require admin privileges. The API key used to initialize the client must have sufficient permissions.\n\n### Billing Information\n\n```python\nfrom venice_ai import VeniceClient\n\n# with VeniceClient() as client:\n#     usage_data = client.billing.get_usage()\n#     print(f\"Total VCU used: {usage_data.total_vcu_consumed}\")\n#     print(f\"Total USD used: {usage_data.total_usd_consumed}\")\n```\n\nFor more detailed examples of other functionalities (Characters, specific parameters for each endpoint), please refer to the [Showcase Application](#showcase-application) and the official [API Documentation](#documentation).\n\n## Error Handling\n\nThe Venice AI client library uses a custom hierarchy of exceptions to help you handle API errors gracefully. 
All library-specific errors inherit from `venice_ai.exceptions.VeniceError`.\n\nCommon exceptions include:\n\n- `venice_ai.exceptions.APIError`: Base class for errors returned by the API (e.g., HTTP 4xx, 5xx).\n- `venice_ai.exceptions.AuthenticationError`: For API key issues (HTTP 401).\n- `venice_ai.exceptions.InvalidRequestError`: For bad request parameters (HTTP 400).\n- `venice_ai.exceptions.RateLimitError`: When API rate limits are exceeded (HTTP 429).\n- `venice_ai.exceptions.PaymentRequiredError`: When payment is required (HTTP 402).\n- `venice_ai.exceptions.ServiceUnavailableError`: When the service is temporarily unavailable (HTTP 503).\n- `venice_ai.exceptions.NotFoundError`: For non-existent resources (HTTP 404).\n- `venice_ai.exceptions.APIConnectionError`: For network connectivity issues.\n- `venice_ai.exceptions.APITimeoutError`: For request timeouts.\n\nExample:\n\n```python\nfrom venice_ai import VeniceClient, exceptions\n\ntry:\n    with VeniceClient() as client:\n        response = client.chat.completions.create(\n            model=\"non_existent_model\",\n            messages=[{\"role\": \"user\", \"content\": \"Test\"}]\n        )\nexcept exceptions.NotFoundError as e:\n    print(f\"Model not found: {e}\")\nexcept exceptions.AuthenticationError as e:\n    print(f\"Authentication failed. Check your API key: {e}\")\nexcept exceptions.APIError as e:\n    print(f\"An API error occurred: {e.status_code} - {e}\")\nexcept exceptions.VeniceError as e:\n    print(f\"A Venice AI client error occurred: {e}\")\n```\n\nAlways check the specific exception type and its attributes (like `e.status_code`, `e.request`, `e.response`) for more details.\n\n## Advanced Usage\n\n### Advanced HTTP Client Configuration\n\nThe SDK allows for advanced configuration of the underlying `httpx` client, enabling scenarios like custom mTLS, specific proxy setups, or detailed transport logging. There are now three main ways to achieve this:\n\n1.  
**Passing a Pre-configured `httpx.Client` / `httpx.AsyncClient`**:\n    You can provide your own `httpx.Client` or `httpx.AsyncClient` instance. The SDK will use it directly but will still manage `base_url`, `timeout` (if not more specific on your client), and authentication. **You are responsible for closing your client instance.**\n\n    ```python\n    import httpx\n    from venice_ai import VeniceClient # or AsyncVeniceClient\n\n    # Synchronous example\n    # httpx >= 0.26 takes `proxy=`; the older `proxies=` mapping was removed in httpx 0.28\n    my_custom_httpx_client = httpx.Client(proxy=\"http://localhost:8080\", timeout=60.0)\n    try:\n        client = VeniceClient(api_key=\"YOUR_API_KEY\", http_client=my_custom_httpx_client)\n        # Use the client...\n        models = client.models.list()\n        print(f\"Found {len(models.data)} models.\")\n    finally:\n        # Important: Close your custom client if it's not managed elsewhere (e.g., as a context manager)\n        if not my_custom_httpx_client.is_closed:\n            my_custom_httpx_client.close()\n\n    # For AsyncVeniceClient, use httpx.AsyncClient and await its aclose() method.\n    ```\n\n2.  **Passing Common `httpx` Settings Directly**:\n    If you don't provide an `http_client` instance, you can pass `httpx.Client` / `httpx.AsyncClient` constructor arguments (e.g., `proxy`, `transport`, `limits`, `verify`) directly to the SDK client. 
The SDK will create and manage its internal `httpx` client with these settings.\n\n    ```python\n    from venice_ai import VeniceClient # or AsyncVeniceClient\n    import httpx # For httpx.Limits, httpx.HTTPTransport\n\n    # Synchronous example\n    client = VeniceClient(\n        api_key=\"YOUR_API_KEY\",\n        proxies={\"all://\": \"http://localhost:8080\"}, # Example proxy\n        transport=httpx.HTTPTransport(retries=3),    # Example custom transport\n        limits=httpx.Limits(max_connections=50),     # Example connection limits\n        verify=False,                                # Example: disable SSL verification (use with caution)\n        default_timeout=30.0                         # Global default timeout for requests\n    )\n    with client: # SDK manages the internal httpx client's lifecycle\n        models = client.models.list()\n        print(f\"Found {len(models.data)} models.\")\n\n    # AsyncVeniceClient works similarly with corresponding async httpx types.\n    ```\n\n3.  **Configuring Automatic Retries**:\n    The SDK automatically retries requests on transient network errors and specific HTTP status codes (e.g., 429, 500, 502, 503, 504) by leveraging `httpx-retries`. 
This behavior is configurable through the following parameters in the `VeniceClient` and `AsyncVeniceClient` constructors:\n\n    - `max_retries` (int, default: 2): Maximum number of retries.\n    - `retry_backoff_factor` (float, default: 0.1): Backoff factor for calculating delay between retries.\n    - `retry_status_forcelist` (list[int], default: `[429, 500, 502, 503, 504]`): HTTP status codes to retry on.\n    - `retry_respect_retry_after_header` (bool, default: True): Whether to respect `Retry-After` headers.\n\n    ```python\n    from venice_ai import VeniceClient # or AsyncVeniceClient\n\n    # Example: Customize retry behavior\n    client = VeniceClient(\n        api_key=\"YOUR_API_KEY\",\n        max_retries=5,\n        retry_backoff_factor=0.5,\n        retry_status_forcelist=[429, 500, 502, 503, 504, 520], # Adding 520 to the list\n        retry_respect_retry_after_header=True, # Explicitly setting\n        default_timeout=30.0 # Global default timeout for requests\n    )\n    with client:\n        # Use the client...\n        try:\n            models = client.models.list()\n            print(f\"Found {len(models.data)} models with custom retry settings.\")\n        except Exception as e:\n            print(f\"An error occurred: {e}\")\n\n    # AsyncVeniceClient works similarly.\n    ```\n\nFor a detailed explanation of all supported parameters and more advanced use cases, please refer to the \"Advanced HTTP Client Configuration\" section in our [API Reference documentation](docs/api.rst).\n\n### Understanding Streaming\n\nWhen `stream=True` is used (e.g., in `chat.completions.create`), the method returns an iterator (`Stream` or `AsyncStream` object from `venice_ai.streaming`). 
These objects wrap the raw data chunks from the API.\n\n```python\nfrom venice_ai import VeniceClient\nfrom venice_ai.streaming import Stream # For type hinting or direct use if needed\n\nwith VeniceClient() as client:\n    stream_response: Stream = client.chat.completions.create( # type: ignore\n        model=\"llama-3.2-3b\",\n        messages=[{\"role\": \"user\", \"content\": \"Tell me a very long story about a brave knight.\"}],\n        stream=True,\n        max_completion_tokens=50 # Example: limit stream length\n    )\n    full_story = []\n    for chunk in stream_response:\n        # chunk is a ChatCompletionChunk Pydantic model, so use attribute access\n        if chunk.choices and chunk.choices[0].delta.content:\n            # print(chunk.choices[0].delta.content, end=\"\") # For live printing\n            full_story.append(chunk.choices[0].delta.content)\n    # Join the collected pieces into the complete text\n    final_text = \"\".join(full_story)\n    print(f\"\\nFull story assembled: {final_text[:100]}...\")\n```\n\nThe `Stream` and `AsyncStream` wrappers handle resource management and provide a consistent iteration interface.\n\n### Token Estimation\n\nThe library includes a utility for estimating token counts:\n\n```python\nfrom venice_ai.utils import estimate_token_count\n\ntext = \"This is some sample text to estimate tokens for.\"\n# By default, uses cl100k_base encoding (common for many OpenAI models)\n# and falls back to a heuristic if tiktoken is not installed.\ncount = estimate_token_count(text)\nprint(f\"Estimated tokens: {count}\")\n```\n\n**Note:** For accurate token counting, install the optional `tiktoken` dependency:\n\n```bash\npip install venice-ai[tokenizers]\n# or with poetry:\npoetry install --extras \"tokenizers\"\n```\n\nWithout `tiktoken`, the library will use a simple heuristic estimation method.\n\n## Showcase Application\n\nThis project focuses on the core Venice AI Python SDK. 
For an interactive demonstration of the library's capabilities, check out the [Venice AI Streamlit Demo](https://github.com/venice-ai/streamlit-demo), which provides a comprehensive UI for chat, image generation, audio synthesis, model listing, and more.\n\nThe demo lives in its own repository for easier deployment and to keep this package's dependencies minimal.\n\n## Testing\n\nThe library includes a comprehensive test suite using `pytest`.\n\nTo run all tests (unit, E2E, benchmarks) and generate a coverage report:\n\n```bash\n# Ensure dev dependencies are installed:\npoetry install --with dev\npoetry run python test_runner.py --group all --coverage --html\n```\n\nThe test runner ([`test_runner.py`](test_runner.py)) also supports an interactive mode and options to run specific test groups or files. Run `poetry run python test_runner.py --help` for more options.\n\n## Documentation\n\nDocumentation for the venice-ai client library is hosted at [venice-ai docs](https://venice-ai.readthedocs.io/).\n\nIt is generated with Sphinx from the docstrings within the codebase and can be built locally from the `docs/` directory.\n\nDetailed API documentation for the Venice.ai API itself is available at [https://docs.venice.ai/api-reference](https://docs.venice.ai/api-reference).\n\n## Contributing\n\nContributions are welcome! Please feel free to open issues for bugs or feature requests. If you'd like to contribute code, please see our [`CONTRIBUTING.md`](CONTRIBUTING.md) for guidelines on reporting issues.\n\n## License\n\nThis project is licensed under the MIT License.\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsethbang%2Fvenice-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsethbang%2Fvenice-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsethbang%2Fvenice-ai/lists"}