{"id":44378805,"url":"https://github.com/sequenzia/mamba-server","last_synced_at":"2026-02-11T21:36:56.617Z","repository":{"id":334924361,"uuid":"1142684035","full_name":"sequenzia/mamba-server","owner":"sequenzia","description":"A production-ready FastAPI with Pydantic AI backend service that provides OpenAI-powered chat completions with streaming support, designed for seamless integration with the Vercel AI SDK.","archived":false,"fork":false,"pushed_at":"2026-01-27T19:14:04.000Z","size":315,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-28T01:45:36.474Z","etag":null,"topics":["ai-backend","fastapi","pydantic-ai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sequenzia.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-26T18:17:19.000Z","updated_at":"2026-01-27T19:14:08.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sequenzia/mamba-server","commit_stats":null,"previous_names":["sequenzia/mamba-server"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/sequenzia/mamba-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequenzia%2Fmamba-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequenzia%2Fmamba-server/tags","releases_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/sequenzia%2Fmamba-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequenzia%2Fmamba-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sequenzia","download_url":"https://codeload.github.com/sequenzia/mamba-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequenzia%2Fmamba-server/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29345692,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-11T20:11:40.865Z","status":"ssl_error","status_checked_at":"2026-02-11T20:10:41.637Z","response_time":97,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-backend","fastapi","pydantic-ai"],"created_at":"2026-02-11T21:36:55.170Z","updated_at":"2026-02-11T21:36:56.610Z","avatar_url":"https://github.com/sequenzia.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Mamba Server\n\n[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.115+-green.svg)](https://fastapi.tiangolo.com/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\nA 
production-ready FastAPI and Pydantic AI backend service that provides OpenAI-powered chat completions with streaming support, designed for seamless integration with the Vercel AI SDK.\n\n## Overview\n\nMamba Server wraps OpenAI models using Pydantic AI's Agent-based architecture, delivering real-time streaming responses via Server-Sent Events (SSE). Built with a layered architecture and streaming-first design, it provides a robust foundation for AI-powered chat applications.\n\n### Key Features\n\n- **Streaming Chat Completions** - Real-time responses via Server-Sent Events (SSE)\n- **Vercel AI SDK Compatible** - Native support for Vercel AI SDK message format\n- **Mamba Agents Integration** - Route requests to specialized pre-configured agents (research, code review)\n- **Display Tools** - 4 built-in tools: `generateForm`, `generateChart`, `generateCode`, `generateCard`\n- **Flexible Authentication** - Support for none, API key, or JWT authentication\n- **Kubernetes Ready** - Health checks, liveness/readiness probes, and Helm-ready manifests\n- **Resilient** - Exponential backoff retry for OpenAI API calls\n- **Multi-source Configuration** - Environment variables, env file, and YAML file support\n\n## Quick Start\n\n### Prerequisites\n\n- Python 3.12 or higher\n- [UV](https://github.com/astral-sh/uv) package manager (recommended) or pip\n- OpenAI API key\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/sequenzia/mamba-server.git\ncd mamba-server\n\n# Install dependencies with UV\nuv sync\n\n# Or with pip\npip install -e .\n```\n\n### Running the Server\n\n```bash\n# Set your OpenAI API key\nexport OPENAI_API_KEY=\"your-api-key\"\n\n# Start the server\nuv run uvicorn mamba.main:app --reload\n\n# Server runs at http://localhost:8000\n```\n\n## Configuration\n\nMamba Server supports multiple configuration sources with the following precedence (highest to lowest):\n\n1. Environment variables with `MAMBA_` prefix\n2. 
`~/mamba.env` file (user home directory)\n3. `config.local.yaml` (optional, git-ignored)\n4. `config/config.yaml`\n5. Code defaults\n\n### Environment Variables\n\n| Variable | Description | Default |\n|----------|-------------|---------|\n| `OPENAI_API_KEY` | OpenAI API key (required) | - |\n| `OPENAI_API_BASE_URL` | Custom OpenAI API base URL | `https://api.openai.com/v1` |\n| `MAMBA_SERVER__HOST` | Server bind address | `0.0.0.0` |\n| `MAMBA_SERVER__PORT` | Server port | `8000` |\n| `MAMBA_OPENAI__DEFAULT_MODEL` | Default OpenAI model | `gpt-4o` |\n| `MAMBA_AUTH__MODE` | Auth mode: `none`, `api_key`, `jwt` | `none` |\n| `MAMBA_LOGGING__LEVEL` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR` | `INFO` |\n\nUse `__` as the nested delimiter (e.g., `MAMBA_OPENAI__TIMEOUT_SECONDS`).\n\n### Env File Configuration\n\nCreate a `~/mamba.env` file in your home directory for persistent settings:\n\n```bash\nOPENAI_API_KEY=your-api-key\nMAMBA_AUTH__MODE=api_key\n```\n\n### YAML Configuration\n\nCreate a `config.local.yaml` in the project root for local overrides:\n\n```yaml\nserver:\n  host: \"127.0.0.1\"\n  port: 8000\n\nopenai:\n  default_model: \"gpt-4o\"\n\nauth:\n  mode: \"api_key\"\n  api_keys:\n    - key: \"your-secret-key\"\n      name: \"dev-key\"\n```\n\n## API Documentation\n\n### Endpoints\n\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/chat` | POST | Streaming chat completions via SSE (supports `agent` param) |\n| `/title/generate` | POST | Generate conversation titles |\n| `/models` | GET | List available models |\n| `/health` | GET | Full health check (dependencies included) |\n| `/health/live` | GET | Liveness probe for Kubernetes |\n| `/health/ready` | GET | Readiness probe for Kubernetes |\n\n### Chat Completions\n\n```bash\ncurl -X POST http://localhost:8000/chat \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello, how are you?\"}\n    ]\n  
}'\n```\n\n**Request Body:**\n\n```json\n{\n  \"messages\": [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n    {\"role\": \"user\", \"content\": \"Hello!\"}\n  ],\n  \"model\": \"gpt-4o\"\n}\n```\n\n**Response:** Server-Sent Events stream with Vercel AI SDK format.\n\n### Interactive Documentation\n\nOnce running, access the auto-generated API docs:\n\n- Swagger UI: http://localhost:8000/docs\n- ReDoc: http://localhost:8000/redoc\n\n## Mamba Agents\n\nMamba Server supports routing requests to specialized pre-configured agents via the `agent` parameter. This allows leveraging purpose-built agents with their own tool sets and system prompts.\n\n### Available Agents\n\n| Agent | Purpose | Tools |\n|-------|---------|-------|\n| `research` | Information gathering and synthesis | Search tools |\n| `code_review` | Code analysis and quality assessment | Complexity metrics tools |\n\n### Usage\n\nInclude the `agent` parameter in your chat completion request:\n\n```bash\ncurl -X POST http://localhost:8000/chat \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Research the latest trends in AI\"}\n    ],\n    \"model\": \"openai/gpt-4o\",\n    \"agent\": \"research\"\n  }'\n```\n\n### Behavior\n\n- **`agent: null`** or omitted - Uses standard ChatAgent flow (backward compatible)\n- **`agent: \"research\"`** - Routes to research agent (ignores `tools` parameter)\n- **`agent: \"code_review\"`** - Routes to code review agent\n- **Invalid agent name** - Returns an error event with list of available agents\n\nWhen using Mamba Agents, the `tools` parameter in the request is ignored as each agent comes with its own pre-configured tool set.\n\n## Development\n\n### Setup\n\n```bash\n# Install with dev dependencies\nuv sync --all-extras\n\n# Or with pip\npip install -e \".[dev]\"\n```\n\n### Code Quality\n\n```bash\n# Run linter\nuv run ruff check src tests\n\n# Run formatter\nuv run 
ruff format src tests\n\n# Auto-fix issues\nuv run ruff check --fix src tests\n```\n\n### Testing\n\n```bash\n# Run all tests\nuv run pytest\n\n# Run with coverage\nuv run pytest --cov=mamba\n\n# Run specific test file\nuv run pytest tests/unit/core/test_agent.py -v\n```\n\n## Deployment\n\n### Docker\n\n```bash\n# Build the image\ndocker build -t mamba-server:latest .\n\n# Run the container\ndocker run -p 8000:8000 \\\n  -e OPENAI_API_KEY=\"your-api-key\" \\\n  mamba-server:latest\n```\n\nThe Dockerfile uses a multi-stage build with Python 3.12-slim and runs as a non-root user (`mamba:1000`).\n\n### Kubernetes\n\nKubernetes manifests are provided in the `/k8s/` directory:\n\n```bash\n# Apply manifests\nkubectl apply -f k8s/\n\n# Or use kustomize\nkubectl apply -k k8s/\n```\n\nHealth probes are pre-configured:\n- **Liveness:** `/health/live`\n- **Readiness:** `/health/ready`\n\n## Project Structure\n\n```\nmamba-server/\n├── src/mamba/\n│   ├── api/                   # HTTP layer\n│   │   ├── handlers/          # Endpoint handlers (chat.py, health.py, models.py, title.py)\n│   │   ├── deps.py            # FastAPI dependency injection\n│   │   └── routes.py          # Route registration\n│   ├── core/                  # Business logic\n│   │   ├── agent.py           # ChatAgent - Pydantic AI wrapper\n│   │   ├── mamba_agent.py     # Mamba Agents framework adapter\n│   │   ├── streaming.py       # SSE encoding, timeout handling\n│   │   ├── messages.py        # Message format conversion\n│   │   ├── tools.py           # Tool definitions (forms, charts, code, cards)\n│   │   ├── tool_schema.py     # OpenAI function format conversion\n│   │   └── title_utils.py     # Title processing utilities\n│   ├── middleware/            # Request processing chain\n│   │   ├── auth.py            # Auth modes: none, api_key, jwt\n│   │   ├── logging.py         # Structured request/response logging\n│   │   └── request_id.py      # X-Request-ID propagation\n│   ├── models/            
    # Pydantic schemas\n│   │   ├── events.py          # StreamEvent discriminated union\n│   │   ├── request.py         # ChatCompletionRequest, UIMessage\n│   │   ├── response.py        # ModelsResponse\n│   │   ├── health.py          # HealthResponse, ComponentHealth\n│   │   └── title.py           # TitleGenerationRequest/Response\n│   ├── utils/                 # Utilities\n│   │   ├── errors.py          # ErrorCode enum, error classification\n│   │   └── retry.py           # @with_retry decorator (exponential backoff)\n│   ├── config.py              # Settings management (multi-source)\n│   └── main.py                # FastAPI app factory, middleware setup\n├── tests/\n│   ├── unit/                  # Unit tests (25 test files)\n│   ├── integration/           # Integration tests\n│   └── e2e/                   # End-to-end tests\n├── k8s/                       # Kubernetes manifests\n│   ├── deployment.yaml        # Deployment with health probes\n│   ├── service.yaml           # Service definition\n│   ├── configmap.yaml         # Configuration\n│   ├── secrets.yaml.example   # Secret template\n│   ├── pdb.yaml               # Pod disruption budget\n│   └── kustomization.yaml     # Kustomize config\n├── config/\n│   └── config.yaml            # Default configuration\n├── Dockerfile                 # Multi-stage production build\n└── pyproject.toml             # Project configuration\n```\n\n## Tech Stack\n\n| Category | Technologies |\n|----------|--------------|\n| Language | Python 3.12+ |\n| Framework | FastAPI 0.115+, Pydantic 2.10+ |\n| AI | Pydantic AI 0.0.49+, OpenAI |\n| HTTP | httpx 0.28+, Uvicorn 0.32+ |\n| Testing | pytest, pytest-asyncio, respx |\n| Linting | Ruff |\n| Build | Hatchling, UV |\n\n## Architecture\n\nMamba Server uses a **Layered/N-Tier architecture** with a **streaming-first design**:\n\n```\n┌─────────────────────────────────────────────────────────────┐\n│                    Client Request                           
│\n└────────────────────────────┬────────────────────────────────┘\n                             │\n┌────────────────────────────▼────────────────────────────────┐\n│                  Middleware Chain                           │\n│         CORS → RequestID → Logging → Auth                   │\n└────────────────────────────┬────────────────────────────────┘\n                             │\n┌────────────────────────────▼────────────────────────────────┐\n│                    API Layer                                │\n│              (handlers, routes, deps)                       │\n└────────────────────────────┬────────────────────────────────┘\n                             │\n┌────────────────────────────▼────────────────────────────────┐\n│                   Core Layer                                │\n│    (ChatAgent, MambaAgentAdapter, streaming, tools)         │\n└────────────────────────────┬────────────────────────────────┘\n                             │\n┌────────────────────────────▼────────────────────────────────┐\n│                  External APIs                              │\n│                  (OpenAI via pydantic-ai)                   │\n└─────────────────────────────────────────────────────────────┘\n```\n\n**Key Design Patterns:**\n- **Factory Pattern** - `create_app()`, `create_agent()`, `create_streaming_response()` for testability\n- **Discriminated Unions** - Type-safe `StreamEvent` and `MessagePart` with Pydantic discriminators\n- **Adapter Pattern** - Message format conversion between UI and OpenAI formats; MambaAgentAdapter for framework integration\n- **Decorator Pattern** - `@with_retry()` for exponential backoff on failures\n- **Dependency Injection** - FastAPI `Depends()` with `Annotated` types\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for 
details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequenzia%2Fmamba-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsequenzia%2Fmamba-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequenzia%2Fmamba-server/lists"}