{"id":29263715,"url":"https://github.com/attach-dev/attach-gateway","last_synced_at":"2025-10-10T07:40:52.728Z","repository":{"id":295810048,"uuid":"975111199","full_name":"attach-dev/attach-gateway","owner":"attach-dev","description":"Drop-in OIDC \u0026 Google A2A auth + Weaviate memory for Ollama, vLLM and any local LLM server.","archived":false,"fork":false,"pushed_at":"2025-08-13T00:59:13.000Z","size":6998,"stargazers_count":12,"open_issues_count":3,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-27T06:12:14.996Z","etag":null,"topics":["a2a","attach-dev","auth0","gateway","jwt","llm","memory","ollama","proxy","vllm","weaviate"],"latest_commit_sha":null,"homepage":"http://attach.dev","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/attach-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-29T19:55:23.000Z","updated_at":"2025-08-13T00:59:19.000Z","dependencies_parsed_at":"2025-06-03T17:30:31.950Z","dependency_job_id":"b3ecd81b-8178-4131-9367-2332dfef1b47","html_url":"https://github.com/attach-dev/attach-gateway","commit_stats":null,"previous_names":["attach-dev/attach-gateway"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/attach-dev/attach-gateway","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/attach-dev%2Fattach-gateway","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/attach-dev%2Fattach-gateway/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/attach-dev%2Fattach-gateway/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/attach-dev%2Fattach-gateway/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/attach-dev","download_url":"https://codeload.github.com/attach-dev/attach-gateway/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/attach-dev%2Fattach-gateway/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279003186,"owners_count":26083533,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["a2a","attach-dev","auth0","gateway","jwt","llm","memory","ollama","proxy","vllm","weaviate"],"created_at":"2025-07-04T12:00:56.173Z","updated_at":"2025-10-10T07:40:52.723Z","avatar_url":"https://github.com/attach-dev.png","language":"Python","readme":"# Attach Gateway\n\n\u003e **Identity \u0026 Memory side‑car** for every LLM engine and multi‑agent framework. Add OIDC / DID SSO, A2A hand‑off, and a pluggable memory bus (Weaviate today) – all with one process.\n\n[![PyPI](https://img.shields.io/pypi/v/attach-dev.svg)](https://pypi.org/project/attach-dev/)\n\n---\n\n## Why it exists\n\nLLM engines such as **Ollama** or **vLLM** ship with **zero auth**.  
Agent‑to‑agent protocols (Google **A2A**, MCP, OpenHands) assume a *Bearer token* is already present but don't tell you how to issue or validate it.  Teams end up wiring ad‑hoc reverse proxies, leaking ports, and copy‑pasting JWT code.\n\n**Attach Gateway** is that missing resource‑server:\n\n*   ✅ Verifies **OIDC / JWT** or **DID‑JWT**\n*   ✅ Stamps `X‑Attach‑User` + `X‑Attach‑Session` headers so every downstream agent/tool sees the same identity\n*   ✅ Implements `/a2a/tasks/send` + `/tasks/status` for Google A2A \u0026 OpenHands hand‑off\n*   ✅ Mirrors prompts \u0026 responses to a memory backend (Weaviate Docker container by default)\n*   ✅ Workflow traces (Temporal)\n\nRun it next to any model server and get secure, shareable context in under 1 minute.\n\n---\n\n## 60‑second Quick‑start (local laptop)\n\n### Option 1: Install from PyPI (Recommended)\n\n```bash\n# 0) prerequisites: Python 3.12, Ollama installed, Auth0 account or DID token\n\n# Install the package\npip install attach-dev\n\n# 1) start memory in Docker (background tab)\n# Mac M1/M2 users: use manual Docker command (see examples/README.md)\ndocker run --rm -d -p 6666:8080 \\\n  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \\\n  semitechnologies/weaviate:1.30.5\n\n# 2) export your short‑lived token\nexport JWT=\"\u003cpaste Auth0 or DID token\u003e\"\nexport OIDC_ISSUER=https://YOUR_DOMAIN.auth0.com\nexport OIDC_AUD=ollama-local\nexport MEM_BACKEND=weaviate\nexport WEAVIATE_URL=http://127.0.0.1:6666\n\n# 3) run gateway\nattach-gateway --port 8080 \u0026\n\n# 4) make a protected Ollama call via the gateway\ncurl -H \"Authorization: Bearer $JWT\" \\\n     -d '{\"model\":\"tinyllama\",\"messages\":[{\"role\":\"user\",\"content\":\"hello\"}]}' \\\n    http://localhost:8080/api/chat | jq .\n```\n\n### Option 2: Install from Source\n\n```bash\n# 0) prerequisites: Python 3.12, Ollama installed, Auth0 account or DID token\n\ngit clone https://github.com/attach-dev/attach-gateway.git \u0026\u0026 cd 
attach-gateway\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install -r requirements.txt\n\n\n# 1) start memory in Docker (background tab)\n\npython script/start_weaviate.py \u0026\n\n# 2) export your short‑lived token\nexport JWT=\"\u003cpaste Auth0 or DID token\u003e\"\nexport OIDC_ISSUER=https://YOUR_DOMAIN.auth0.com\nexport OIDC_AUD=ollama-local\nexport MEM_BACKEND=weaviate\nexport WEAVIATE_URL=http://127.0.0.1:6666\n\n# 3) run gateway\nuvicorn main:app --port 8080 \u0026\n\n# The gateway exposes your Auth0 credentials for the demo UI at\n# `/auth/config`. The values are read from `AUTH0_DOMAIN`,\n# `AUTH0_CLIENT` and `OIDC_AUD`.\n\n# 4) make a protected Ollama call via the gateway\ncurl -H \"Authorization: Bearer $JWT\" \\\n     -d '{\"model\":\"tinyllama\",\"messages\":[{\"role\":\"user\",\"content\":\"hello\"}]}' \\\n    http://localhost:8080/api/chat | jq .\n```\n\nIn another terminal, try the Temporal demo:\n\n```bash\npip install temporalio  # optional workflow engine\npython examples/temporal_adapter/worker.py \u0026\npython examples/temporal_adapter/client.py\n```\n\nYou should see a JSON response plus `X‑ATTACH‑Session‑Id` header – proof the pipeline works.\n\n---\n\n## Use in your project\n\n1. Copy `.env.example` → `.env` and fill in OIDC + backend URLs  \n2. `pip install attach-dev python-dotenv`  \n3. 
`attach-gateway` (reads .env automatically)\n\n→ See **[docs/configuration.md](docs/configuration.md)** for framework integration and **[examples/](examples/)** for code samples.\n\n---\n\n## Architecture (planner → coder hand‑off)\n\n```mermaid\nflowchart TD\n    %%────────────────────────────────\n    %%  COMPONENTS  \n    %%────────────────────────────────\n    subgraph Front-end\n        UI[\"Browser\u003cbr/\u003e demo.html\"]\n    end\n\n    subgraph Gateway\n        GW[\"Attach Gateway\u003cbr/\u003e (OIDC SSO + A2A)\"]\n    end\n\n    subgraph Agents\n        PL[\"Planner Agent\u003cbr/\u003eFastAPI :8100\"]\n        CD[\"Coder Agent\u003cbr/\u003eFastAPI :8101\"]\n    end\n\n    subgraph Memory\n        WV[\"Weaviate (Docker)\\nclass MemoryEvent\"]\n    end\n\n    subgraph Engine\n        OL[\"Ollama / vLLM\u003cbr/\u003e:11434\"]\n    end\n\n    %%────────────────────────────────\n    %%  USER FLOW\n    %%────────────────────────────────\n    UI -- ① POST /a2a/tasks/send\u003cbr/\u003eBearer JWT, prompt --\u003e GW\n\n    %%─ Planner hop\n    GW -- ② Proxy → planner\u003cbr/\u003e(X-Attach-User, Session) --\u003e PL\n    PL -- ③ Write plan doc --\u003e WV\n    PL -- ④ /a2a/tasks/send\\nbody:{mem_id} --\u003e GW\n\n    %%─ Coder hop\n    GW -- ⑤ Proxy → coder --\u003e CD\n    CD -- ⑥ GET plan by mem_id --\u003e WV\n    CD -- ⑦ POST /api/chat\\nprompt(plan) --\u003e GW\n    GW -- ⑧ Proxy → Ollama --\u003e OL\n    OL -- ⑨ JSON response --\u003e GW\n    GW -- ⑩ Write response to Weaviate --\u003e WV\n    GW -- ⑪ /a2a/tasks/status = done --\u003e UI\n```\n\n**Key headers**\n\n| Header | Meaning |\n|--------|---------|\n| `Authorization: Bearer \u003cJWT\u003e` | OIDC or DID token proved by gateway |\n| `X‑Attach‑User` | stable user ID (`auth0|123` or `did:pkh:…`) |\n| `X‑Attach‑Session` | deterministic hash (user + UA) for request trace |\n\n---\n\n## Live two‑agent demo\n\n```bash\n\n# pane 1 – memory (Docker)\npython script/start_weaviate.py\n\n# pane 2 – 
gateway\nuvicorn main:app --port 8080\n\n# pane 3 – planner agent\nuvicorn examples.agents.planner:app --port 8100\n\n# pane 4 – coder agent\nuvicorn examples.agents.coder:app   --port 8101\n\n# pane 5 – static chat UI\ncd examples/static \u0026\u0026 python -m http.server 9000\nopen http://localhost:9000/demo.html\n```\nType a request like *\"Write Python to sort a list.\"*  The browser shows:\n1. Planner message   → logged in gateway, plan row appears in memory.\n2. Coder reply       → code response, second memory row, status `done`.\n\n---\n\n## Directory map\n\n| Path | Purpose |\n|------|---------|\n| `auth/` | OIDC \u0026 DID‑JWT verifiers |\n| `middleware/` | JWT middleware, session header, mirror trigger |\n| `a2a/` | `/tasks/send` \u0026 `/tasks/status` routes |\n| `mem/` | pluggable memory writers (`weaviate.py`, `sakana.py`) |\n| `proxy/` | Engine-agnostic HTTP proxy logic |\n| `examples/agents/` | *examples* – Planner \u0026 Coder FastAPI services |\n| `examples/static/` | `demo.html` chat page |\n\n---\n\n## Auth core\n\n`auth.verify_jwt()` accepts three token formats and routes them automatically:\n\n1. Standard OIDC JWTs\n2. `did:key` tokens\n3. 
`did:pkh` tokens\n\nExample DID-JWT request:\n```bash\ncurl -X POST /v1/resource \\\n     -H \"Authorization: Bearer did:key:z6Mki...\u003csig\u003e.\u003cpayload\u003e.\u003csig\u003e\"\n```\n\nExample OIDC JWT request:\n```bash\ncurl -X POST /v1/resource \\\n     -H \"Authorization: Bearer $JWT\"\n```\n\n### Other IdPs (Descope)\nTo use a different OIDC provider, set the following environment variables:\n```bash\nOIDC_ISSUER=\u003cyour-oidc-issuer\u003e\nOIDC_AUD=\u003cyour-oidc-audience\u003e\nAUTH_BACKEND=\u003cyour-authentication-provider\u003e\n```\nDescope-specific environment variables:\n```bash\nDESCOPE_BASE_URL=https://api.descope.com  # or \u003cyour-descope-base-url\u003e\nDESCOPE_PROJECT_ID=\u003cyour-descope-project-id\u003e\nDESCOPE_CLIENT_ID=\u003cyour-descope-inbound-app-client-id\u003e\nDESCOPE_CLIENT_SECRET=\u003cyour-descope-inbound-app-client-secret\u003e\n```\n\n## 💾 Memory: logs\n\nSend Sakana-formatted logs to the gateway and they will be stored as\n`MemoryEvent` objects in Weaviate.\n\n```bash\ncurl -X POST /v1/logs \\\n     -H \"Authorization: Bearer $JWT\" \\\n     -d '{\"run_id\":\"abc\",\"level\":\"info\",\"message\":\"hi\"}'\n# =\u003e HTTP/1.1 202 Accepted\n```\n\n## Usage hooks\n\nEmit token usage metrics for every request. 
Choose a backend via\n`USAGE_METERING` (alias `USAGE_BACKEND`):\n\n```bash\nexport USAGE_METERING=prometheus  # or null\n```\n\nA Prometheus counter `attach_usage_tokens_total{user,direction,model}` is\nexposed for Grafana dashboards.\nSet `USAGE_METERING=null` (the default) to disable metering entirely.\n\n\u003e **⚠️ Usage hooks depend on the quota middleware.**  \n\u003e Make sure `MAX_TOKENS_PER_MIN` is set (any positive number) so the  \n\u003e `TokenQuotaMiddleware` is enabled; the middleware is what records usage  \n\u003e events that feed Prometheus.\n\n```bash\n# Enable usage tracking (set any reasonable limit)\nexport MAX_TOKENS_PER_MIN=60000\nexport USAGE_METERING=prometheus\n```\n\n#### OpenMeter (Stripe / ClickHouse)\n\n```bash\n# No additional dependencies needed - uses direct HTTP API\nexport MAX_TOKENS_PER_MIN=60000              # Required: enables quota middleware\nexport USAGE_METERING=openmeter              # Required: activates OpenMeter backend  \nexport OPENMETER_API_KEY=your-api-key-here   # Required: API authentication\nexport OPENMETER_URL=https://openmeter.cloud # Optional: defaults to https://openmeter.cloud\n```\n\nEvents are sent directly to OpenMeter's HTTP API and are processed by the LLM tokens meter for billing integration with Stripe.\n\n\u003e **⚠️ All three variables are required for OpenMeter to work:**  \n\u003e - `MAX_TOKENS_PER_MIN` enables the quota middleware that records usage events  \n\u003e - `USAGE_METERING=openmeter` activates the OpenMeter backend  \n\u003e - `OPENMETER_API_KEY` provides authentication to OpenMeter's API  \n\nThe gateway gracefully falls back to `NullUsageBackend` if any required variable is missing.\n\n### Scraping metrics\n\n```bash\ncurl -H \"Authorization: Bearer $JWT\" http://localhost:8080/metrics\n```\n\n## Token quotas\n\nAttach Gateway can enforce per-user token limits. 
Install the optional\ndependency with `pip install attach-dev[quota]` and set\n`MAX_TOKENS_PER_MIN` in your environment to enable the middleware. The\ncounter defaults to the `cl100k_base` encoding; override with\n`QUOTA_ENCODING` if your model uses a different tokenizer. The default\nin-memory store works in a single process and is not shared between\nworkers—requests retried across processes may be double-counted. Use Redis\nfor production deployments.\nIf `tiktoken` is missing, a byte-count fallback is used which counts about\nfour times more tokens than the `cl100k` tokenizer – install `tiktoken` in\nproduction.\n\n### Enable token quotas\n\n```bash\n# Optional: Enable token quotas\nexport MAX_TOKENS_PER_MIN=60000\npip install tiktoken  # or pip install attach-dev[quota]\n```\n\nTo customize the tokenizer:\n```bash\nexport QUOTA_ENCODING=cl100k_base  # default\n```\n\n---\n\n## License\n\nMIT\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fattach-dev%2Fattach-gateway","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fattach-dev%2Fattach-gateway","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fattach-dev%2Fattach-gateway/lists"}