{"id":30177586,"url":"https://github.com/ebowwa/ai-proxy-core","last_synced_at":"2026-04-29T22:33:54.365Z","repository":{"id":307595172,"uuid":"1030065528","full_name":"ebowwa/ai-proxy-core","owner":"ebowwa","description":"Minimal, stateless AI service proxy for Gemini and other LLMs","archived":false,"fork":false,"pushed_at":"2025-09-07T19:46:06.000Z","size":13908,"stargazers_count":0,"open_issues_count":11,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-07T21:27:27.005Z","etag":null,"topics":["ai","api-proxy","async","gemini","library","llm","multimodal","ollama","openai","opentelemetry","python","websocket"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/ai-proxy-core/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ebowwa.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-01T03:20:47.000Z","updated_at":"2025-09-07T19:46:09.000Z","dependencies_parsed_at":"2025-08-01T04:47:12.751Z","dependency_job_id":"417d84ce-e414-4146-86f7-4c442c52378e","html_url":"https://github.com/ebowwa/ai-proxy-core","commit_stats":null,"previous_names":["ebowwa/ai-proxy-core"],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/ebowwa/ai-proxy-core","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebowwa%2Fai-proxy-core","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebowwa%2Fai-proxy-core/
tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebowwa%2Fai-proxy-core/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebowwa%2Fai-proxy-core/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ebowwa","download_url":"https://codeload.github.com/ebowwa/ai-proxy-core/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ebowwa%2Fai-proxy-core/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32446820,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T22:27:22.272Z","status":"ssl_error","status_checked_at":"2026-04-29T22:10:49.234Z","response_time":110,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","api-proxy","async","gemini","library","llm","multimodal","ollama","openai","opentelemetry","python","websocket"],"created_at":"2025-08-12T05:00:24.602Z","updated_at":"2026-04-29T22:33:54.360Z","avatar_url":"https://github.com/ebowwa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Proxy Core\n\nA unified Python package providing a single interface for AI completions across multiple providers (OpenAI, Gemini, Ollama), plus **image generation capabilities** (v0.4.0+). 
Features intelligent model management, automatic provider routing, zero-config setup, and abstract image generation with DALL-E 3.\n\n\u003e 💡 **Why not LangChain?** Read our [philosophy and architectural rationale](https://github.com/ebowwa/ai-proxy-core/issues/13) for choosing simplicity over complexity.\n\n\u003e 🎯 **What's Next?** See our [wrapper layer roadmap](https://github.com/ebowwa/ai-proxy-core/issues/14) for planned features and what belongs in a clean LLM wrapper.\n\n## Installation\n\nBasic (Google Gemini only):\n```bash\npip install ai-proxy-core\n```\n\nWith specific providers (optional dependencies):\n```bash\npip install ai-proxy-core[openai]     # OpenAI support (includes image generation)\npip install ai-proxy-core[anthropic]  # Anthropic support (coming soon)\npip install ai-proxy-core[telemetry]  # OpenTelemetry support\npip install ai-proxy-core[all]        # Everything\n```\n\nOr install from source:\n```bash\ngit clone https://github.com/ebowwa/ai-proxy-core.git\ncd ai-proxy-core\npip install -e .\n# With all extras: pip install -e \".[all]\"\n```\n\n## Quick Start\n\n\u003e 🤖 **AI Integration Help**: \n\u003e - **Using the library?** Copy our [user agent prompt](.claude/agents/ai-proxy-core-user.md) to any LLM for instant integration guidance and code examples\n\u003e - **Developing the library?** Use our [developer agent prompt](.claude/agents/ai-proxy-core-developer.md) for architecture details and contribution help\n\n### Unified Interface (Recommended)\n\n```python\nfrom ai_proxy_core import CompletionClient\n\n# Single client for all providers\nclient = CompletionClient()\n\n# Works with any model - auto-detects provider\nresponse = await client.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"gpt-4\"  # Auto-routes to OpenAI\n)\n\nresponse = await client.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"gemini-1.5-flash\"  # Auto-routes to 
Gemini\n)\n\nresponse = await client.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"llama2\"  # Auto-routes to Ollama\n)\n\n# All return the same standardized format\nprint(response[\"choices\"][0][\"message\"][\"content\"])\n```\n\n### Intelligent Model Selection\n\n```python\n# Find the best model for your needs\nbest_model = await client.find_best_model({\n    \"multimodal\": True,\n    \"min_context_limit\": 32000,\n    \"local_preferred\": False\n})\n\nresponse = await client.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Describe this image\"}],\n    model=best_model[\"id\"]\n)\n```\n\n### Model Discovery\n\n```python\n# List all available models across providers\nmodels = await client.list_models()\nfor model in models:\n    print(f\"{model['id']} ({model['provider']}) - {model['context_limit']:,} tokens\")\n\n# List models from specific provider\nopenai_models = await client.list_models(provider=\"openai\")\n```\n\n## Ollama Integration\n\n### Prerequisites\n```bash\n# Install Ollama from https://ollama.ai\n# Start Ollama service\nollama serve\n\n# Pull a model\nollama pull llama3.2\n```\n\n### Using Ollama with CompletionClient\n```python\nfrom ai_proxy_core import CompletionClient, ModelManager\n\n# Option 1: Auto-detection (Ollama will be detected if running)\nclient = CompletionClient()\n\n# Option 2: With custom ModelManager\nmanager = ModelManager()\nclient = CompletionClient(model_manager=manager)\n\n# List Ollama models\nmodels = await client.list_models(provider=\"ollama\")\nprint(f\"Available Ollama models: {[m['id'] for m in models]}\")\n\n# Create completion\nresponse = await client.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"llama3.2\",\n    provider=\"ollama\",  # Optional, auto-detected from model name\n    temperature=0.7\n)\n```\n\n### Direct Ollama Usage\n```python\nfrom ai_proxy_core import OllamaCompletions\n\nollama 
= OllamaCompletions()\n\n# List available models\nmodels = ollama.list_models()\nprint(f\"Available models: {models}\")\n\n# Create completion\nresponse = await ollama.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Explain quantum computing\"}],\n    model=\"llama3.2\",\n    temperature=0.7,\n    max_tokens=500\n)\n```\n\nSee [examples/ollama_complete_guide.py](examples/ollama_complete_guide.py) for comprehensive examples including error handling, streaming, and advanced features.\n\n## Image Generation (v0.4.1+)\n\n### Generate Images with Explicit Model Selection\n\n```python\nfrom ai_proxy_core import OpenAIImageProvider, ImageModel, ImageSize, ImageQuality, ImageStyle\n\n# Initialize the provider\nprovider = OpenAIImageProvider(api_key=\"your-openai-api-key\")\n\n# Generate with DALL-E 3 (explicitly specify model)\nresponse = provider.generate(\n    model=ImageModel.DALLE_3,   # Required: specify which model\n    prompt=\"A modern app icon with turquoise background and camera symbol\",\n    size=ImageSize.SQUARE,       # 1024x1024\n    quality=ImageQuality.HD,      # HD or STANDARD\n    style=ImageStyle.VIVID        # VIVID or NATURAL (DALL-E 3 only)\n)\n\n# Generate with GPT-Image-1 (better instruction following)\nresponse = provider.generate(\n    model=ImageModel.GPT_IMAGE_1,  # Explicitly use GPT-Image-1\n    prompt=\"Create a detailed app icon following these specifications...\",\n    size=\"4096x4096\",              # Supports up to 4K resolution\n    quality=\"high\"                 # low, medium, high, or auto\n)\n\n# Access the generated image\nwith open(\"icon.png\", \"wb\") as f:\n    f.write(response[\"images\"][\"data\"])\n\n# Token usage for GPT-Image-1\nif response.get(\"usage\"):\n    print(f\"Tokens used: {response['usage']['total_tokens']}\")\n```\n\n### Available Models\n\n```python\n# List available models and their capabilities\nfor model in provider.list_models():\n    print(f\"Model: {model['id']}\")\n    print(f\"  
Sizes: {model['capabilities']['sizes']}\")\n    print(f\"  Features: {model['capabilities']['features']}\")\n\n# Models:\n# - dall-e-2: Multiple images, editing, 256x256 to 1024x1024\n# - dall-e-3: Styles, HD quality, up to 1792x1024\n# - gpt-image-1: Token pricing, 4K resolution, better instructions\n```\n\n### Gemini 2.5 Flash Image (Preview)\n\n```python\nfrom ai_proxy_core import CompletionClient\nimport asyncio, base64, os\n\nasync def main():\n    client = CompletionClient()\n    # Text-to-image\n    resp = await client.create_completion(\n        messages=[{\"role\":\"user\",\"content\":\"Photoreal banana on a desk\"}],\n        model=\"gemini-2.5-flash-image-preview\",\n        return_images=True,  # forces image modality even with text-only prompt\n    )\n    img = resp.get(\"images\")\n    if isinstance(img, list):\n        img = img[0] if img else None\n    if img and img.get(\"data\"):\n        with open(\"gemini_banana.jpg\", \"wb\") as f:\n            f.write(img[\"data\"])\n\n    # Edit (image + instruction)\n    def to_data_url(p):\n        with open(p, \"rb\") as f: b = f.read()\n        return \"data:image/jpeg;base64,\" + base64.b64encode(b).decode(\"utf-8\")\n\n    messages = [{\n        \"role\": \"user\",\n        \"content\": [\n            {\"type\":\"image_url\",\"image_url\":{\"url\": to_data_url(\"sample.jpg\")}},\n            {\"type\":\"text\",\"text\":\"Remove the background and add a soft shadow\"}\n        ]\n    }]\n    resp = await client.create_completion(\n        messages=messages,\n        model=\"gemini-2.5-flash-image-preview\",\n        return_images=True,\n    )\n    img = resp.get(\"images\")\n    if isinstance(img, list):\n        img = img[0] if img else None\n    if img and img.get(\"data\"):\n        with open(\"gemini_edit.jpg\", \"wb\") as f:\n            f.write(img[\"data\"])\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n```\n\n- Response schema remains OpenAI-like and non-breaking:\n  - Text (if any) 
in `choices[0].message.content`\n  - Image bytes in `response[\"images\"]` (single object) or a list if multiple\n- Aliases: `gemini-2.5-flash-image`, `g2.5-flash-image` route to preview for now\n\n### Edit Images (DALL-E 2 Only)\n\n```python\n# Image editing is only available with DALL-E 2\nresponse = provider.edit(\n    image=original_image_bytes,\n    prompt=\"Add a sunset background\",\n    model=ImageModel.DALLE_2,  # Only DALL-E 2 supports editing\n    mask=mask_bytes,           # Optional mask for inpainting\n    n=2                        # Generate 2 variations\n)\n```\n\n### Model-Specific Features\n\n```python\n# DALL-E 2: Generate multiple variations\nresponse = provider.generate(\n    model=ImageModel.DALLE_2,\n    prompt=\"App icon variations\",\n    n=5,  # Generate 5 variations\n    size=\"512x512\"\n)\n\n# DALL-E 3: Use styles for different aesthetics\nresponse = provider.generate(\n    model=ImageModel.DALLE_3,\n    prompt=\"Photorealistic app icon\",\n    style=ImageStyle.NATURAL,  # or VIVID\n    quality=ImageQuality.HD\n)\n\n# GPT-Image-1: High resolution with compression\nresponse = provider.generate(\n    model=ImageModel.GPT_IMAGE_1,\n    prompt=\"Ultra-detailed 4K app icon\",\n    size=\"4096x4096\",\n    quality=\"high\",\n    output_compression=95  # Optional compression\n)\n```\n\n## Advanced Usage\n\n### Provider-Specific Completions\n\nIf you need provider-specific features, you can still use the individual clients:\n\n```python\nfrom ai_proxy_core import GoogleCompletions, OpenAICompletions, OllamaCompletions\n\n# Google Gemini with safety settings\ngoogle = GoogleCompletions(api_key=\"your-gemini-api-key\")\nresponse = await google.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"gemini-1.5-flash\",\n    safety_settings=[{\"category\": \"HARM_CATEGORY_HARASSMENT\", \"threshold\": \"BLOCK_MEDIUM_AND_ABOVE\"}]\n)\n\n# OpenAI with tool calling\nopenai = 
OpenAICompletions(api_key=\"your-openai-key\")\nresponse = await openai.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"What's the weather?\"}],\n    model=\"gpt-4\",\n    tools=[{\"type\": \"function\", \"function\": {\"name\": \"get_weather\"}}]\n)\n\n# Ollama for local models\nollama = OllamaCompletions()  # Auto-detects localhost:11434\nresponse = await ollama.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"llama3.2\",\n    temperature=0.7\n)\n```\n\n### OpenAI-Compatible Endpoints\n\n```python\n# Works with any OpenAI-compatible API (Groq, Anyscale, Together, etc.)\ngroq = OpenAICompletions(\n    api_key=\"your-groq-key\",\n    base_url=\"https://api.groq.com/openai/v1\"\n)\n\nresponse = await groq.create_completion(\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n    model=\"mixtral-8x7b-32768\"\n)\n```\n\n### Gemini Live Session\n\n```python\nfrom ai_proxy_core import GeminiLiveSession\n\n# Example 1: Basic session (no system prompt)\nsession = GeminiLiveSession(api_key=\"your-gemini-api-key\")\n\n# Example 2: Session with system prompt (simple string format)\nsession = GeminiLiveSession(\n    api_key=\"your-gemini-api-key\",\n    system_instruction=\"You are a helpful voice assistant. 
Be concise and friendly.\"\n)\n\n# Example 3: Session with built-in tools enabled\nsession = GeminiLiveSession(\n    api_key=\"your-gemini-api-key\",\n    enable_code_execution=True,      # Enable Python code execution\n    enable_google_search=True,       # Enable web search\n    system_instruction=\"You are a helpful assistant with access to code execution and web search.\"\n)\n\n# Example 4: Session with custom function declarations\nfrom google.genai import types\n\ndef get_weather(location: str) -\u003e dict:\n    # Your custom function implementation\n    return {\"location\": location, \"temp\": 72, \"condition\": \"sunny\"}\n\nweather_function = types.FunctionDeclaration(\n    name=\"get_weather\",\n    description=\"Get current weather for a location\",\n    parameters=types.Schema(\n        type=\"OBJECT\",\n        properties={\n            \"location\": types.Schema(type=\"STRING\", description=\"City name\")\n        },\n        required=[\"location\"]\n    )\n)\n\nsession = GeminiLiveSession(\n    api_key=\"your-gemini-api-key\",\n    custom_tools=[types.Tool(function_declarations=[weather_function])],\n    system_instruction=\"You can help with weather information.\"\n)\n\n# Set up callbacks\nsession.on_audio = lambda data: print(f\"Received audio: {len(data)} bytes\")\nsession.on_text = lambda text: print(f\"Received text: {text}\")\nsession.on_function_call = lambda call: handle_function_call(call)\n\nasync def handle_function_call(call):\n    if call[\"name\"] == \"get_weather\":\n        result = get_weather(**call[\"args\"])\n        await session.send_function_result(result)\n\n# Start session\nawait session.start()\n\n# Send audio/text\nawait session.send_audio(audio_data)\nawait session.send_text(\"What's the weather in Boston?\")\n\n# Stop when done\nawait session.stop()\n```\n\n### Integration with FastAPI\n\n#### Chat Completions API\n```python\nfrom fastapi import FastAPI, HTTPException\nfrom pydantic import BaseModel\nfrom ai_proxy_core 
import CompletionClient\n\napp = FastAPI()\nclient = CompletionClient()\n\nclass CompletionRequest(BaseModel):\n    messages: list\n    model: str = \"gemini-1.5-flash\"\n    temperature: float = 0.7\n\n@app.post(\"/api/chat/completions\")\nasync def create_completion(request: CompletionRequest):\n    try:\n        response = await client.create_completion(\n            messages=request.messages,\n            model=request.model,\n            temperature=request.temperature\n        )\n        return response\n    except Exception as e:\n        raise HTTPException(status_code=500, detail=str(e))\n```\n\n#### Audio in Chat Completions (Gemini)\n\nThe completions endpoint supports audio inputs for Google/Gemini via multimodal content:\n- data URL audio_url entries (e.g., data:audio/mp3;base64,...)\n- OpenAI-style input_audio objects with base64 data and format\n\nExample request messages:\n```json\n[\n  {\n    \"role\": \"user\",\n    \"content\": [\n      {\"type\": \"text\", \"text\": \"Transcribe and summarize this audio:\"},\n      {\n        \"type\": \"audio_url\",\n        \"audio_url\": {\n          \"url\": \"data:audio/mp3;base64,AAA...\"\n        }\n      }\n    ]\n  }\n]\n```\n\nOpenAI-style input_audio:\n```json\n[\n  {\n    \"role\": \"user\",\n    \"content\": [\n      {\"type\": \"text\", \"text\": \"Please analyze this clip.\"},\n      {\n        \"type\": \"input_audio\",\n        \"input_audio\": {\n          \"data\": \"AAA...\",\n          \"format\": \"mp3\"\n        }\n      }\n    ]\n  }\n]\n```\n\nSupported formats include MP3, WAV, AAC, OGG, FLAC. 
For WebSocket (Live), only PCM is supported at this time; non-PCM will be rejected with a clear message.\n\n#### WebSocket for Gemini Live (Fixed in v0.3.3)\n\n```python\nfrom fastapi import FastAPI, WebSocket, WebSocketDisconnect\nfrom google import genai\nfrom google.genai import types\nimport asyncio\n\napp = FastAPI()\n\n@app.websocket(\"/api/gemini/ws\")\nasync def gemini_websocket(websocket: WebSocket):\n    await websocket.accept()\n    \n    # Create Gemini client\n    client = genai.Client(\n        http_options={\"api_version\": \"v1beta\"},\n        api_key=\"your-gemini-api-key\"\n    )\n    \n    # Configure for text (audio requires PCM format)\n    config = types.LiveConnectConfig(\n        response_modalities=[\"TEXT\"],\n        generation_config=types.GenerationConfig(\n            temperature=0.7,\n            max_output_tokens=1000\n        )\n    )\n    \n    # Connect using async context manager\n    async with client.aio.live.connect(\n        model=\"gemini-2.0-flash-exp\",\n        config=config\n    ) as session:\n        \n        # Handle bidirectional communication\n        async def receive_from_client():\n            async for message in websocket.iter_json():\n                if message[\"type\"] in [\"text\", \"message\"]:\n                    text = message.get(\"data\", {}).get(\"text\", \"\")\n                    if text:\n                        await session.send(input=text, end_of_turn=True)\n        \n        async def receive_from_gemini():\n            while True:\n                turn = session.receive()\n                async for response in turn:\n                    if hasattr(response, 'server_content'):\n                        content = response.server_content\n                        if hasattr(content, 'model_turn'):\n                            for part in content.model_turn.parts:\n                                if hasattr(part, 'text') and part.text:\n                                    await 
websocket.send_json({\n                                        \"type\": \"response\",\n                                        \"text\": part.text\n                                    })\n        \n        # Run both tasks concurrently\n        task1 = asyncio.create_task(receive_from_client())\n        task2 = asyncio.create_task(receive_from_gemini())\n        \n        # Wait for either to complete\n        done, pending = await asyncio.wait(\n            [task1, task2],\n            return_when=asyncio.FIRST_COMPLETED\n        )\n        \n        # Clean up\n        for task in pending:\n            task.cancel()\n```\n\n**Try the HTML Demo:**\n```bash\n# Start the FastAPI server\nuv run main.py\n\n# Open the HTML demo in your browser\nopen examples/gemini_live_demo.html\n```\n\nThe demo provides a full-featured chat interface with WebSocket connection to Gemini Live.\n\nNote on audio (WebSocket):\n- Audio input is supported for PCM only (16-bit PCM). Send base64-encoded PCM and it will be forwarded to Gemini Live.\n- Non-PCM inputs (e.g., WebM/Opus) are rejected with: \"Audio requires PCM format - WebM conversion not yet implemented\".\n- Example client payloads:\n  - Raw base64 string: {\"type\": \"audio\", \"data\": \"\u003cbase64_pcm\u003e\"}\n  - Object form: {\"type\": \"audio\", \"data\": {\"base64\": \"\u003cbase64_pcm\u003e\", \"mime_type\": \"audio/pcm\"}}\n\n## Features\n\n### 🚀 **Unified Interface**\n- **Single client for all providers** - No more provider-specific code\n- **Automatic provider routing** - Detects provider from model name\n- **Intelligent model selection** - Find best model based on requirements\n- **Zero-config setup** - Auto-detects available providers from environment\n\n### 🧠 **Model Management**\n- **Cross-provider model discovery** - List models from OpenAI, Gemini, Ollama\n- **Rich model metadata** - Context limits, capabilities, multimodal support\n- **Automatic model provisioning** - Downloads Ollama models as needed\n- 
**Model compatibility checking** - Ensures models support requested features\n\n### 🔧 **Developer Experience**\n- **No framework dependencies** - Use with FastAPI, Flask, or any Python app\n- **Async/await support** - Modern async Python\n- **Type hints** - Full type annotations\n- **Easy testing** - Mock the unified client in your tests\n- **Backward compatible** - All existing provider-specific code continues to work\n\n### 🎯 **Advanced Features**\n- **WebSocket support** - Real-time audio/text streaming with Gemini Live\n- **Built-in tools** - Code execution and Google Search with simple flags\n- **Custom functions** - Add your own function declarations\n- **Optional telemetry** - OpenTelemetry integration for production monitoring\n- **Provider-specific optimizations** - Access advanced features when needed\n\n### Telemetry\n\nBasic observability with OpenTelemetry (optional):\n\n```bash\n# Install with: pip install \"ai-proxy-core[telemetry]\"\n\n# Enable telemetry via environment variables\nexport OTEL_ENABLED=true\nexport OTEL_EXPORTER_TYPE=console  # or \"otlp\" for production\nexport OTEL_ENDPOINT=localhost:4317  # for OTLP exporter\n\n# Automatic telemetry for:\n# - Request counts by model/status\n# - Request latency tracking\n# - Session duration for WebSockets\n# - Error tracking with types\n```\n\nThe telemetry is completely optional and has zero overhead when disabled.\n\n## Project Structure\n\n\u003e 📝 **Note:** Full documentation of the project structure is being tracked in [Issue #12](https://github.com/ebowwa/ai-proxy-core/issues/12)\n\nThis project serves dual purposes:\n- **Python Library** (`/ai_proxy_core`): Installable via pip for use in Python applications\n- **Web Service** (`/api`): FastAPI endpoints for REST API access\n\n## Development\n\n### Releasing New Versions\n\nWe provide an automated release script that handles version bumping, building, and publishing:\n\n```bash\n# Make the script executable (first time only)\nchmod +x 
release.sh\n\n# Release a new version\n./release.sh 0.1.9\n```\n\nThe script will:\n1. Show current version and validate the new version format\n2. Prompt for a release description (for CHANGELOG)\n3. 
Update version in all necessary files (`pyproject.toml`, `setup.py`, `__init__.py`)\n4. Update CHANGELOG.md with your description\n5. Build the package\n6. Upload to PyPI\n7. Commit changes and create a git tag\n8. Push to GitHub with the new tag\n\n### Manual Build Process\n\nIf you prefer to build manually:\n\n```bash\nuv run python setup.py sdist bdist_wheel\ntwine upload dist/*\n```\n\n## Client identification and IP fallback\n\nTo attribute requests to a product/app or device, the API accepts optional client metadata on both REST and WebSocket paths. If client_id is not provided, the server uses the client IP as a fallback (works for curl/CLI users too).\n\n- Optional fields (REST body and WS config message):\n  - app\n  - client_id\n  - device\n  - user_id\n  - session_id\n  - request_id\n\n- Optional HTTP headers (used when body fields are absent):\n  - X-App\n  - X-Client-Id\n  - X-Device\n  - X-User-Id\n  - X-Session-Id\n  - X-Request-Id\n\n- IP resolution order:\n  1) X-Forwarded-For (first IP)\n  2) Forwarded header (for= token, supports quotes and IPv6 [brackets])\n  3) X-Real-IP\n  4) Socket peer address\n\n- Precedence:\n  - Body values override headers.\n  - If client_id is missing, it defaults to the resolved IP.\n\n- Telemetry:\n  - Providers tag counters/durations with:\n    - client.app\n    - client.device\n    - client.id\n    - client.ip\n\n### REST example (no client_id provided)\n```bash\ncurl -X POST http://localhost:8000/api/chat/completions \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"model\":\"gemini-1.5-flash\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}'\n```\nIf behind a proxy, include X-Forwarded-For or X-Real-IP so the correct source IP is used.\n\n### WebSocket example config\nAfter connecting to ws://localhost:8000/api/gemini/ws send:\n```json\n{\"type\":\"config\",\"app\":\"caringmind\",\"device\":\"cli\"}\n```\nIf no client_id is present, the server computes it from the IP and acknowledges with:\n```json\n{\"type\":\"config_success\",\"message\":\"Configuration acknowledged\",\"client_id\":\"\u003cderived-ip\u003e\",\"ip\":\"\u003cderived-ip\u003e\"}\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Febowwa%2Fai-proxy-core","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Febowwa%2Fai-proxy-core","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Febowwa%2Fai-proxy-core/lists"}