🤖 Pool and route Codex accounts behind one Gateway
https://github.com/icoretech/codex-pooler


# Codex Pooler


One gateway for many Codex accounts.

Pool capacity, preserve sessions, route requests, and expose stable API keys
for agents and tools.


Quick start · Harness · Configuration · Deployment


Screenshots: gateway overview, Upstreams (upstream account readiness), Pools
(Pool dashboard), and Request logs.

Codex Pooler is a self-hosted gateway for sharing Codex account capacity across
agents, tools, and teams.

Instead of binding each client to one Codex account, you add accounts to Pools
and issue stable Pool API keys. Clients send familiar Codex backend or
OpenAI-compatible requests; Codex Pooler selects the right account based on
model support, limits, session continuity, routing policy, and health.

Operators get one place to manage accounts, keys, routing, request accounting,
audit logs, and health without storing prompts, files, audio, images, bearer
tokens, or raw Codex secrets. Instance owners keep the global administration
surface, while instance admins work only with their assigned Pools.

## Highlights

- **One key for many accounts:** group Codex accounts into Pools and give
clients stable Pool API keys instead of binding each tool to one account
- **Smarter capacity sharing:** route each request to an eligible account with
available limits, matching model support, health, session state, and Pool
policy
- **Codex backend compatibility:** point Codex-compatible clients at Codex
Pooler and keep responses, compacting, usage, files, audio, images, and
backend websocket flows working through pooled accounts
- **OpenAI-compatible SDK surface:** let `/v1`-only apps and agent tools use
multiple Codex subscriptions behind one gateway, with supported requests
translated and routed through Codex capacity to help contain API spend
- **Session-aware websockets:** keep resumable Codex sessions and websocket
reconnects attached to the right upstream account without translating backend
websocket traffic through an HTTP compatibility layer
- **Prompt-cache locality:** use a transient `prompt_cache_key` to prefer the
same eligible upstream account for repeat stateless requests, improving
provider-side cache locality without storing prompts or responses locally
- **Operator dashboard:** manage Pool-scoped accounts, API keys, invites,
usage, request logs, audit logs, MCP access, and the owner-only jobs,
operators, and system settings surfaces
- **Privacy-minded observability:** store request, routing, and audit metadata
without storing prompts, file bodies, audio, images, bearer tokens, cookies,
raw Codex account tokens, or raw API keys
- **Configurable without code changes:** tune Pool policy, gateway defaults,
diagnostics, model support, limits, and operational settings from the admin UI
- **Built for self-hosting:** run on Elixir/Erlang's fault-tolerant runtime,
start locally with Docker Compose, or deploy the Helm chart with separate web,
worker, scheduler, and migration roles for Kubernetes-friendly, multinode
growth

## Harness Configuration

Keep Pool API keys in environment variables when the harness supports secret
expansion. The `/mcp` endpoint is an optional operator-only add-on for metadata
inspection; Codex Pooler runtime clients do not need it. If a desktop harness
persists remote MCP headers in its own private settings, use a dedicated
operator-scoped MCP token. For a local instance, the URLs are:

```text
Codex backend base URL: http://localhost:4000/backend-api/codex
OpenAI SDK base URL: http://localhost:4000/v1
Optional operator MCP URL: http://localhost:4000/mcp
```

For a deployed instance, replace `http://localhost:4000` with your deployed host,
for example `https://codex-pooler.example.com`.
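As a minimal sketch, the three URLs can be derived from a single origin so that switching between a local and a deployed instance is a one-line change. The helper name `pooler_urls` and its dictionary keys are hypothetical; only the three paths come from this README.

```python
from typing import Dict


def pooler_urls(origin: str) -> Dict[str, str]:
    """Build the client-facing URLs for one Codex Pooler origin."""
    base = origin.rstrip("/")  # tolerate a trailing slash on the origin
    return {
        "codex_backend": f"{base}/backend-api/codex",
        "openai_sdk": f"{base}/v1",
        "operator_mcp": f"{base}/mcp",  # optional, operator-only
    }


for name, url in pooler_urls("https://codex-pooler.example.com").items():
    print(f"{name}: {url}")
```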

### OpenCode (`~/.config/opencode/opencode.jsonc`)

![Codex Pooler OpenCode integration](.github/assets/codex-pooler-opencode.png)

OpenCode talks to Codex Pooler through the OpenAI-compatible `/v1` surface. The
provider uses the Pool API key, and the optional remote MCP entry uses an
operator-owned MCP token. MCP is not required for OpenCode to use Codex Pooler;
it only gives an operator MCP host read-only metadata tools. Its websocket
support is the narrow Responses websocket route at `GET /v1/responses`, not
OpenAI Realtime SDK compatibility.

```jsonc
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "openai": {
      "npm": "@ai-sdk/openai",
      "name": "Codex Pooler",
      "options": {
        "baseURL": "http://localhost:4000/v1",
        "apiKey": "{env:CODEX_POOLER_API_KEY}",
        "reasoningEffort": "high",
        "reasoningSummary": "auto",
        "textVerbosity": "medium",
        "include": ["reasoning.encrypted_content"],
        "store": false
      },
      "models": {
        "gpt-5.5": {
          "id": "gpt-5.5",
          "name": "GPT-5.5",
          "family": "gpt",
          "attachment": true,
          "reasoning": true,
          "tool_call": true,
          "temperature": false,
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          },
          "limit": {
            "context": 400000,
            "input": 256000,
            "output": 128000
          }
        }
      }
    }
  },
  // Optional operator-only MCP metadata add-on. Omit for normal model/runtime use.
  "mcp": {
    "codex_pooler": {
      "type": "remote",
      "url": "http://localhost:4000/mcp",
      "oauth": false,
      "headers": {
        "Authorization": "Bearer {env:CODEX_POOLER_MCP_KEY}"
      },
      "enabled": true,
      "timeout": 30000
    }
  }
}
```

Define only models that your assigned Pool can serve. For deployed instances,
change `baseURL` to `https://codex-pooler.example.com/v1`; if you keep the optional
operator MCP entry, change its `url` to `https://codex-pooler.example.com/mcp`.

### Codex (`~/.codex/config.toml`)

![Codex Pooler Codex CLI integration](.github/assets/codex-pooler-codex.png)

Codex should use the backend compatibility route, not the `/v1` SDK route.
Keep the provider `name` as `OpenAI`; Codex uses that value for provider-family
behavior even when the request is routed through Codex Pooler.

```toml
model = "gpt-5.5"
model_provider = "codex-pooler-ws"

[model_providers.codex-pooler-ws]
name = "OpenAI"
base_url = "http://localhost:4000/backend-api/codex"
env_key = "CODEX_POOLER_API_KEY"
wire_api = "responses"
supports_websockets = true
requires_openai_auth = true

[model_providers.codex-pooler-http]
name = "OpenAI"
base_url = "http://localhost:4000/backend-api/codex"
env_key = "CODEX_POOLER_API_KEY"
wire_api = "responses"
supports_websockets = false
requires_openai_auth = true

# Optional operator-only MCP metadata add-on. Omit for normal Codex runtime use.
[mcp_servers.codex_pooler]
url = "http://localhost:4000/mcp"
bearer_token_env_var = "CODEX_POOLER_MCP_KEY"
```

Use the websocket provider for normal Codex backend behavior, and keep the HTTP
provider when you need to force SSE-only coverage. For deployed instances,
change both `base_url` values to `https://codex-pooler.example.com/backend-api/codex`;
if you keep the optional operator MCP add-on, change its `url` to
`https://codex-pooler.example.com/mcp`.

Codex filters resumable conversations by `model_provider`. If you already have
sessions created with the built-in `openai` provider and want them to appear
under `codex-pooler-ws`, re-tag both the JSONL transcripts and the newer SQLite
state database. Run these with Codex closed; they edit local state in place. The
transcript rewrite scans the whole sessions directory and can take a while on
large installs. Set `TO_PROVIDER=codex-pooler-http` if you made the HTTP
provider your default.

```bash
set -eu

FROM_PROVIDER="openai"
TO_PROVIDER="codex-pooler-ws"

find ~/.codex/sessions -type f -name '*.jsonl' \
  -exec perl -pi -e \
  "s/\"model_provider\":\"${FROM_PROVIDER}\"/\"model_provider\":\"${TO_PROVIDER}\"/g" \
  {} +

for db in ~/.codex/state_*.sqlite; do
  [ -e "$db" ] || continue
  sqlite3 "$db" \
    "UPDATE threads SET model_provider = '${TO_PROVIDER}' WHERE model_provider = '${FROM_PROVIDER}';"
done
```

On Windows, run the same migration from PowerShell. This expects `sqlite3` to be
available on `PATH`.

```powershell
$ErrorActionPreference = "Stop"

$FromProvider = "openai"
$ToProvider = "codex-pooler-ws"
$CodexHome = Join-Path $HOME ".codex"

$FromJson = '"model_provider":"' + $FromProvider + '"'
$ToJson = '"model_provider":"' + $ToProvider + '"'

Get-ChildItem -Path (Join-Path $CodexHome "sessions") -Recurse -Filter "*.jsonl" |
  ForEach-Object {
    $Path = $_.FullName
    $TempPath = "$Path.tmp"
    $Reader = [System.IO.StreamReader]::new($Path)
    $Writer = [System.IO.StreamWriter]::new(
      $TempPath,
      $false,
      [System.Text.UTF8Encoding]::new($false)
    )

    try {
      while (($Line = $Reader.ReadLine()) -ne $null) {
        $Writer.WriteLine($Line.Replace($FromJson, $ToJson))
      }
    } finally {
      $Reader.Dispose()
      $Writer.Dispose()
    }

    Move-Item -Force $TempPath $Path
  }

Get-ChildItem -Path $CodexHome -Filter "state_*.sqlite" |
  ForEach-Object {
    sqlite3 $_.FullName `
      "UPDATE threads SET model_provider = '$ToProvider' WHERE model_provider = '$FromProvider';"
  }
```
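Because the migration edits local state in place, it can help to count what would change before running it. The following is a read-only sketch, not part of Codex or Codex Pooler: the function name `count_pending` is hypothetical, and it assumes the same layout the scripts above rely on (JSONL transcripts under `sessions/` and a `threads` table with a `model_provider` column in each `state_*.sqlite` file).

```python
import sqlite3
from pathlib import Path
from typing import Tuple


def count_pending(codex_home: Path, from_provider: str = "openai") -> Tuple[int, int]:
    """Return (transcript lines, thread rows) still tagged with from_provider."""
    needle = f'"model_provider":"{from_provider}"'

    line_hits = 0
    sessions = codex_home / "sessions"
    if sessions.is_dir():
        for path in sessions.rglob("*.jsonl"):
            with path.open(encoding="utf-8", errors="replace") as handle:
                line_hits += sum(1 for line in handle if needle in line)

    row_hits = 0
    for db in sorted(codex_home.glob("state_*.sqlite")):
        # mode=ro opens the database read-only, so this check cannot alter state.
        conn = sqlite3.connect(db.resolve().as_uri() + "?mode=ro", uri=True)
        try:
            (count,) = conn.execute(
                "SELECT COUNT(*) FROM threads WHERE model_provider = ?",
                (from_provider,),
            ).fetchone()
            row_hits += count
        finally:
            conn.close()

    return line_hits, row_hits
```

Calling `count_pending(Path.home() / ".codex")` before and after the migration gives a quick check that no `openai`-tagged lines or rows remain. A state database without a `threads` table raises `sqlite3.OperationalError` rather than being skipped.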

### OpenClaw (`~/.openclaw/openclaw.json`)

![Codex Pooler OpenClaw integration](.github/assets/codex-pooler-openclaw.png)

OpenClaw uses `openai/*` as the canonical OpenAI route. To keep that model name
while sending agent turns to Codex Pooler's OpenAI-compatible `/v1` surface,
point the OpenAI provider at Codex Pooler and use the current OpenClaw runtime id.

```json5
{
  agents: {
    defaults: {
      model: { primary: "openai/gpt-5.5" },
    },
  },
  models: {
    mode: "merge",
    providers: {
      openai: {
        baseUrl: "http://localhost:4000/v1",
        apiKey: "${CODEX_POOLER_API_KEY}",
        api: "openai-responses",
        agentRuntime: { id: "openclaw" },
        timeoutSeconds: 300,
        models: [
          {
            id: "gpt-5.5",
            name: "GPT-5.5 via Codex Pooler",
            reasoning: true,
            input: ["text", "image"],
            contextWindow: 400000,
            contextTokens: 256000,
            maxTokens: 128000,
          },
        ],
      },
    },
  },
  // Optional operator-only MCP metadata add-on. Omit for normal model/runtime use.
  mcp: {
    servers: {
      codex_pooler: {
        url: "http://localhost:4000/mcp",
        transport: "streamable-http",
        headers: {
          Authorization: "Bearer ${CODEX_POOLER_MCP_KEY}",
        },
      },
    },
  },
}
```

Define only models that your assigned Pool can serve. For deployed instances,
change `baseUrl` to `https://codex-pooler.example.com/v1`; if you keep the optional
operator MCP add-on, change its `url` to `https://codex-pooler.example.com/mcp`.
If you prefer to keep Codex Pooler separate from OpenClaw's built-in OpenAI
provider behavior, use a custom provider id such as `codex-pooler/gpt-5.5`
instead. That follows OpenClaw's generic custom-provider shape, but tools that
look specifically for `openai/gpt-*` model refs will not see it as canonical
OpenAI.

### Hermes Agent (`~/.hermes/config.yaml` + `auth.json`)

![Codex Pooler Hermes Agent integration](.github/assets/codex-pooler-hermes.png)

Hermes works best through its `openai-api` provider with the Responses transport
forced explicitly. This is the recommended Codex Pooler setup. Keep the Pool API
key in `~/.hermes/.env` and point the provider config at Codex Pooler's `/v1`
surface. The `mcp_servers` block is an optional operator-only add-on for
read-only metadata tools; Codex Pooler works without it.

```bash
OPENAI_API_KEY=
OPENAI_BASE_URL=http://localhost:4000/v1
# Optional operator-only MCP metadata add-on:
CODEX_POOLER_MCP_KEY=
```

```yaml
model:
  default: gpt-5.5
  provider: openai-api
  base_url: http://localhost:4000/v1
  api_mode: codex_responses
  context_length: 400000
  supports_vision: true

agent:
  image_input_mode: native

# Optional operator-only MCP metadata add-on. Omit for model/runtime use.
mcp_servers:
  codex_pooler:
    url: http://localhost:4000/mcp
    headers:
      Authorization: "Bearer ${CODEX_POOLER_MCP_KEY}"
    enabled: true
    timeout: 120
    connect_timeout: 15
```

Remote HTTP MCP servers require Hermes' `mcp` extra. If
`hermes mcp test codex_pooler` reports `mcp.client.streamable_http is not
available`, install MCP support into the Hermes environment, following the
[Hermes MCP Integration docs](https://hermes-agent.nousresearch.com/docs/user-guide/features/mcp),
and rerun the test.

Check the one-shot model path:

```bash
hermes -z 'Reply with exactly: hermes openai api ok' --ignore-rules
```

Hermes can also be made to use its `openai-codex` provider against Codex
Pooler. This is less direct because Hermes treats `openai-codex` as an OAuth
provider by default; add a Pool API key credential ahead of any existing
device-code credential and keep the entry's `base_url` on `/v1`. Use this only
when you specifically need Hermes' `openai-codex` credential-pool behavior; the
`openai-api` configuration above is the preferred setup. This variant stores
the key in `auth.json` because Hermes credential pools live there.

```bash
HERMES_CODEX_BASE_URL=http://localhost:4000/v1
# Optional operator-only MCP metadata add-on:
CODEX_POOLER_MCP_KEY=
```

```yaml
model:
  default: gpt-5.5
  provider: openai-codex
  base_url: http://localhost:4000/v1
  context_length: 400000
  supports_vision: true

agent:
  image_input_mode: native

# Optional operator-only MCP metadata add-on. Omit for model/runtime use.
mcp_servers:
  codex_pooler:
    url: http://localhost:4000/mcp
    headers:
      Authorization: "Bearer ${CODEX_POOLER_MCP_KEY}"
    enabled: true
    timeout: 120
    connect_timeout: 15
```

```json
{
  "active_provider": "openai-codex",
  "credential_pool": {
    "openai-codex": [
      {
        "label": "codex-pooler",
        "auth_type": "api_key",
        "priority": -10,
        "source": "manual",
        "access_token": "",
        "base_url": "http://localhost:4000/v1"
      }
    ]
  }
}
```

For deployed instances, change the model URLs to
`https://codex-pooler.example.com/v1`; if you keep the optional operator MCP add-on,
change the MCP `url` to `https://codex-pooler.example.com/mcp`.

### Aider (`~/.aider.conf.yml`)

Aider uses the OpenAI-compatible route with the `openai/` model prefix. Keep the
Pool API key in the environment and point Aider's OpenAI API base at Codex
Pooler's `/v1` surface.

```yaml
model: openai/gpt-5.5
openai-api-base: http://localhost:4000/v1
```

Smoke-test from a repository:

```bash
export OPENAI_API_KEY="$CODEX_POOLER_API_KEY"
aider --model openai/gpt-5.5 --message 'Reply with exactly: aider ok'
```

For deployed instances, change `openai-api-base` to
`https://codex-pooler.example.com/v1`.

### Continue (`~/.continue/config.yaml`)

Continue can use Codex Pooler as an OpenAI-compatible provider by setting
`provider: openai`, `apiBase` to `/v1`, and the Pool API key as a Continue
secret. For `gpt-5*` models, Continue uses the Responses API by default.

```yaml
name: Codex Pooler
version: 1.0.0
schema: v1

models:
  - name: GPT-5.5 via Codex Pooler
    provider: openai
    model: gpt-5.5
    apiBase: http://localhost:4000/v1
    apiKey: "${{ secrets.CODEX_POOLER_API_KEY }}"
    roles:
      - chat
      - edit
      - apply
      - summarize
    capabilities:
      - tool_use
      - image_input

# Optional operator-only MCP metadata add-on. Omit for model/runtime use.
mcpServers:
  - name: codex_pooler
    type: streamable-http
    url: http://localhost:4000/mcp
    requestOptions:
      timeout: 30000
      headers:
        Authorization: "Bearer ${{ secrets.CODEX_POOLER_MCP_KEY }}"
```

For deployed instances, change `apiBase` to `https://codex-pooler.example.com/v1`;
if you keep the optional operator MCP add-on, change the MCP `url` to
`https://codex-pooler.example.com/mcp`.

Check the headless CLI path after saving the config:

```bash
export CODEX_POOLER_API_KEY=
npx -y @continuedev/cli@latest -p \
  --config ~/.continue/config.yaml \
  --silent \
  'Reply with exactly: continue ok'
```

The Pool API key authenticates model requests. The MCP token authenticates only
the operator metadata endpoint.
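The split between the two credentials can be sketched as a small lookup that picks the environment variable by URL path. This is an illustration only: the function names are hypothetical, and the variable names `CODEX_POOLER_API_KEY` and `CODEX_POOLER_MCP_KEY` are simply the ones used in this README's examples.

```python
import os
from typing import Dict, Mapping
from urllib.parse import urlsplit


def credential_env_var(url: str) -> str:
    """Name the environment variable whose secret belongs on this URL."""
    path = urlsplit(url).path.rstrip("/")
    if path == "/mcp" or path.startswith("/mcp/"):
        return "CODEX_POOLER_MCP_KEY"  # operator-owned MCP token
    for prefix in ("/v1", "/backend-api"):
        if path == prefix or path.startswith(prefix + "/"):
            return "CODEX_POOLER_API_KEY"  # Pool API key for model requests
    raise ValueError(f"not a Codex Pooler runtime or MCP path: {path!r}")


def bearer_header(url: str, env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the Authorization header without ever mixing the two secrets."""
    return {"Authorization": f"Bearer {env[credential_env_var(url)]}"}
```

Keeping the choice in one place makes it harder to send a Pool API key to `/mcp` by accident.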

### Cline (`~/.cline` + `~/.cline/mcp.json`)

Cline CLI accepts `openai` as shorthand for its OpenAI-compatible provider and
stores it as `openai-compatible`. Configure it with the Pool API key, the Codex
Pooler `/v1` base URL, and the model id that your assigned Pool can serve.

```bash
cline auth \
  --provider openai \
  --apikey "$CODEX_POOLER_API_KEY" \
  --baseurl http://localhost:4000/v1 \
  --modelid gpt-5.5
```

Check the headless CLI path after saving auth:

```bash
cline --provider openai \
  --model gpt-5.5 \
  --json \
  --auto-approve false \
  'Reply with exactly: cline ok'
```

For optional operator MCP in Cline CLI, add the remote server to
`~/.cline/mcp.json`. Codex Pooler does not require this for model use. The VS
Code extension opens its own MCP settings JSON from the Cline MCP Servers panel;
use the same `mcpServers` shape there.

```json
{
  "mcpServers": {
    "codex_pooler": {
      "url": "http://localhost:4000/mcp",
      "headers": {
        "Authorization": "Bearer "
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
```

For deployed instances, change `--baseurl` to `https://codex-pooler.example.com/v1`
and, if you keep the optional operator MCP add-on, change the MCP `url` to
`https://codex-pooler.example.com/mcp`.

Use a Pool API key for `/v1` model requests and an operator MCP token for
`/mcp`. Do not reuse the Pool API key for MCP.

### Goose (`~/.config/goose/config.yaml`)

Configure Goose's OpenAI provider for Codex Pooler's OpenAI-compatible
chat-completions path. Keep the Pool API key in `OPENAI_API_KEY` or Goose's
secret storage.

```yaml
GOOSE_PROVIDER: openai
GOOSE_MODEL: gpt-5.5
OPENAI_HOST: http://localhost:4000
OPENAI_BASE_PATH: v1/chat/completions
```

Check the headless CLI path with tool access enabled:

```bash
export OPENAI_API_KEY="$CODEX_POOLER_API_KEY"
goose run \
  --no-session \
  --provider openai \
  --model gpt-5.5 \
  --with-builtin developer \
  --text 'Use your developer tool to create goose-ok.txt containing exactly: goose ok. Then reply with exactly: goose ok'
```

For optional operator MCP metadata access, add a remote Streamable HTTP
extension. Codex Pooler model use does not require this. Goose stores remote
extension headers in its config, so use a dedicated MCP token.

```yaml
# Optional operator-only MCP metadata add-on. Omit for model/runtime use.
extensions:
  codex_pooler:
    enabled: true
    type: streamable_http
    name: codex_pooler
    uri: http://localhost:4000/mcp
    headers:
      Authorization: "Bearer "
    timeout: 300
    bundled: null
    available_tools: []
```

For deployed instances, change `OPENAI_HOST` to `https://codex-pooler.example.com`;
if you keep the optional operator MCP add-on, change the extension `uri` to
`https://codex-pooler.example.com/mcp`.

Use a Pool API key for OpenAI-compatible model requests and an operator MCP token
for `/mcp`. Do not reuse the Pool API key for MCP.

### OpenAI Python SDK

OpenAI Python SDK clients can use the OpenAI-compatible `/v1` surface by setting
`base_url` to the Codex Pooler `/v1` URL and using the Pool API key as the API
key.

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CODEX_POOLER_API_KEY"],
    base_url="http://localhost:4000/v1",
)

response = client.responses.create(
    model="gpt-5.5",
    input="Write a one-sentence status update.",
)

print(response.output_text)
```

For deployed instances, change `base_url` to `https://codex-pooler.example.com/v1`.

### OpenAI Node SDK

OpenAI Node SDK clients use the same OpenAI-compatible `/v1` surface. Configure
`baseURL` with the Codex Pooler `/v1` URL and pass the Pool API key as the API
key.

```js
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.CODEX_POOLER_API_KEY,
  baseURL: "http://localhost:4000/v1",
});

const response = await client.responses.create({
  model: "gpt-5.5",
  input: "Write a one-sentence status update.",
});

console.log(response.output_text);
```

For deployed instances, change `baseURL` to `https://codex-pooler.example.com/v1`.

### Vercel AI SDK

Vercel AI SDK can point its OpenAI provider at Codex Pooler by creating a custom
provider with `createOpenAI`. The provider calls the OpenAI-compatible `/v1`
surface with the Pool API key.

```ts
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const pooler = createOpenAI({
  apiKey: process.env.CODEX_POOLER_API_KEY,
  baseURL: "http://localhost:4000/v1",
});

const { text } = await generateText({
  model: pooler.responses("gpt-5.5"),
  prompt: "Write a one-sentence status update.",
});

console.log(text);
```

For deployed instances, change `baseURL` to `https://codex-pooler.example.com/v1`.

### Claude Code

![Claude Code on Codex Pooler](.github/assets/codex-pooler-claude.png)

## Quick Start With Docker Compose

This runs the published release image with a local Postgres database. It is the
fastest way to try Codex Pooler on a laptop or small server.

Prerequisites:

- Docker with Compose
- Git, if you are cloning the repository
- `openssl`

Start Codex Pooler:

```bash
git clone https://github.com/icoretech/codex-pooler.git
cd codex-pooler

# Optional: pin a release tag before generating .env.
# Omit this for a quick trial that follows the latest tag.
# export CODEX_POOLER_IMAGE_TAG=

scripts/self-host/generate-env.sh
docker compose pull
docker compose up -d
```

The first run pulls the app and Postgres images, waits for Postgres health, runs
the migration container, then starts the web app.

Open `http://localhost:4000`. On the first visit, create the owner account at
`/bootstrap`, then sign in and start with `/admin/pools`.

To verify the first-run redirect before opening a browser:

```bash
curl -sS -D - -o /dev/null http://localhost:4000/ | grep -i '^location: /bootstrap'
curl -fsS http://localhost:4000/bootstrap/status
```

The status endpoint should return `{"status":"ok","bootstrap":"pending"}` on a
fresh database.
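The same check can be scripted, for example in a provisioning step that waits for a fresh instance. This is a minimal sketch: both function names are hypothetical, and the only response shape assumed is the JSON quoted above.

```python
import json
import urllib.request


def bootstrap_pending(body: str) -> bool:
    """True when the status payload matches the fresh-database response."""
    payload = json.loads(body)
    return payload.get("status") == "ok" and payload.get("bootstrap") == "pending"


def fetch_bootstrap_pending(origin: str = "http://localhost:4000") -> bool:
    """Fetch /bootstrap/status from a running instance and interpret it."""
    with urllib.request.urlopen(f"{origin}/bootstrap/status", timeout=5) as response:
        return bootstrap_pending(response.read().decode("utf-8"))
```

`fetch_bootstrap_pending()` needs a running instance; `bootstrap_pending` can be exercised on its own with a captured response body.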

Useful commands:

```bash
docker compose ps
docker compose logs -f app
docker compose down
```

Use `http://localhost:4000` for the default Compose stack even if the Phoenix
startup banner prints an endpoint URL such as `https://localhost`; the Compose
port mapping is the local URL to open. The release image includes the OS
timezone database used for operator timezone display.

To remove the local database too:

```bash
docker compose down -v
```

## First Runtime Setup

After bootstrap:

1. Create a Pool in `/admin/pools`
2. Import or connect Codex accounts in `/admin/upstreams`
3. Create a Pool API key in `/admin/api-keys`
4. Point Codex or SDK clients at one of the runtime base URLs:

```text
Codex backend base URL: http://localhost:4000/backend-api/codex
OpenAI SDK base URL: http://localhost:4000/v1
```

Treat an imported Codex `auth.json` as owned by Codex Pooler after import. Do
not keep using the same `auth.json` from another Codex install, machine, or
automation unless you accept that provider refresh-token rotation can invalidate
one copy and move the account to `reauth_required`.

Use the generated Pool API key as the bearer token. That key represents the
Pool, not a single Codex account, so Codex Pooler can pick the best eligible
account for each request. Raw API keys are shown only once when created or
rotated.
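For clients that do not use an SDK, a request against the `/v1` surface can be sketched with the standard library alone. The function name is hypothetical. Passing `prompt_cache_key` as a request-body field follows the OpenAI Responses API; that Codex Pooler reads it from the body in exactly this form is an assumption based on the Highlights section, not something this README spells out.

```python
import json
import urllib.request
from typing import Optional


def build_responses_request(
    base_url: str,
    pool_api_key: str,
    model: str,
    text: str,
    prompt_cache_key: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the /v1/responses route."""
    body = {"model": model, "input": text}
    if prompt_cache_key is not None:
        # Assumption: sent as a body field, as in the OpenAI Responses API.
        body["prompt_cache_key"] = prompt_cache_key
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/responses",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # The Pool API key is the bearer token; it names the Pool, not an account.
            "Authorization": f"Bearer {pool_api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending the prepared request with `urllib.request.urlopen(request)` requires a running instance and a model that the Pool can serve.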

## Operator Roles

The first bootstrap account is an `instance_owner`. Owners have instance-wide
administration access: they create Pools, assign operators to Pools, manage
operators, inspect global jobs, and change system settings.

Additional operators can be owners or `instance_admin`s. Instance admins are
Pool-scoped: they can work only with active Pools assigned to them and metadata
derived from those Pools. If no Pools are assigned, the admin UI shows empty
Pool-scoped states instead of exposing global data. Archiving or deleting a Pool
removes future instance-admin visibility for that Pool; historical request and
audit rows for archived or deleted Pools remain owner-only.
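The visibility rules above can be summarized in a few lines. This is an illustrative model only, not Codex Pooler's implementation: the `Pool` and `Operator` classes and the single `active` flag are simplifications introduced for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Pool:
    id: str
    active: bool = True  # False stands in for archived or deleted


@dataclass(frozen=True)
class Operator:
    role: str  # "instance_owner" or "instance_admin"
    assigned_pool_ids: FrozenSet[str] = frozenset()


def visible_pools(operator: Operator, pools: Iterable[Pool]) -> List[Pool]:
    """Owners see every Pool; admins see only active Pools assigned to them."""
    if operator.role == "instance_owner":
        return list(pools)
    return [
        pool
        for pool in pools
        if pool.active and pool.id in operator.assigned_pool_ids
    ]
```

An admin with no assignments gets an empty list, which corresponds to the empty Pool-scoped states described above.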

## Runtime Compatibility

Codex Pooler supports two client-facing shapes:

- **Codex backend clients:** `/backend-api/codex/*`, `/backend-api/files`,
`/backend-api/transcribe`, usage routes, and backend websocket response
streams
- **OpenAI-style SDK clients:** `/v1/models`, `/v1/responses`,
`/v1/chat/completions`, `/v1/files`, `/v1/audio/transcriptions`, selected
image endpoints, and narrow Responses websocket compatibility on
`GET /v1/responses`

The `/v1` surface is compatibility, not a second engine. Supported requests are
translated into Codex-compatible calls, then routed through the same Pool rules,
limit checks, accounting, and account selection path. `/v1/realtime` and OpenAI
Realtime SDK websocket or session routes are not supported.

Continuity headers are local routing inputs. Codex Pooler chooses them in this
order: `x-codex-window-id` > `x-codex-session-id` > `session-id` >
`x-session-affinity` > `session_id` > `x-codex-conversation-id`. `session-id` and
`x-session-affinity` are not forwarded upstream. The raw `x-codex-window-id`
value is hashed before it becomes a local persisted session key. Local timing
regression checks showed that `/v1/responses` HTTP streaming and Responses
websocket paths stay inside the observed client budgets with the existing stream
timeout settings, so no new route-specific timeout defaults are required.
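The precedence order can be read as "first header present wins". A minimal sketch, assuming case-insensitive header names and treating a blank value as absent (both are assumptions, as are the function names):

```python
from typing import Dict, Mapping, Optional, Tuple

# Highest precedence first, as listed above.
CONTINUITY_HEADERS = (
    "x-codex-window-id",
    "x-codex-session-id",
    "session-id",
    "x-session-affinity",
    "session_id",
    "x-codex-conversation-id",
)

# Local routing inputs that are not forwarded upstream.
LOCAL_ONLY_HEADERS = frozenset({"session-id", "x-session-affinity"})


def pick_continuity(headers: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (header name, value) for the highest-precedence header present."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in CONTINUITY_HEADERS:
        value = lowered.get(name, "").strip()
        if value:
            return name, value
    return None


def forwardable(headers: Mapping[str, str]) -> Dict[str, str]:
    """Drop the continuity headers that stay local to the gateway."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in LOCAL_ONLY_HEADERS
    }
```

The sketch stops at selection; the hashing of `x-codex-window-id` before persistence is not modeled because the README does not name the hash.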

## Operator MCP Service

Codex Pooler includes an optional metadata-only MCP endpoint at `/mcp` for
trusted operators who want an MCP host to inspect Pools, upstream accounts, Pool
API key metadata, operators, invites, request logs, audit logs, and MCP service
status. This operator add-on is not required for Codex Pooler runtime clients.
The service is read-only and has no mutation tools. It uses the same owner vs
assigned-Pool visibility model as the admin UI, but connected MCP hosts can read
the metadata visible to that operator, so only connect hosts you trust with that
view.

MCP access uses operator-owned bearer MCP tokens, not Pool API keys, browser
sessions, cookies, query tokens, invite tokens, upstream tokens, or custom
headers. Operators manage their own MCP account gate and tokens from
`/admin/settings?tab=account`; the instance-wide service gate is managed from
`/admin/system`. Both gates must be enabled before a token works. Raw MCP tokens
are shown only once when created, and per-key usage tracking, counters, last IP,
and user-agent history are intentionally not stored.

The `/mcp` route inherits the runtime ingress IP allowlist and trusted-proxy
settings. If the allowlist is empty, the firewall is off; if it is configured,
the resolved client IP must match before MCP authentication or tool dispatch.
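The allowlist rule can be sketched with Python's `ipaddress` module. The entry format (single addresses or CIDR ranges) and the function name are assumptions for illustration; the README only states that an empty list disables the firewall and that a configured list must match the resolved client IP.

```python
import ipaddress
from typing import Sequence


def ingress_allowed(client_ip: str, allowlist: Sequence[str]) -> bool:
    """Apply the allowlist before any authentication or tool dispatch."""
    if not allowlist:
        return True  # empty allowlist: firewall is off
    address = ipaddress.ip_address(client_ip)
    return any(
        # strict=False accepts entries such as "10.0.0.5/8" with host bits set.
        address in ipaddress.ip_network(entry, strict=False)
        for entry in allowlist
    )
```

Behind a reverse proxy, `client_ip` would be the address resolved through the trusted-proxy settings, not the socket peer.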

## Configuration

`scripts/self-host/generate-env.sh` writes a local `.env` with generated
secrets and local defaults. Keep that file private and don't reuse generated
values between public installs.

Environment variables are only for values the release needs before it can read
the database:

- `CODEX_POOLER_IMAGE` and `CODEX_POOLER_IMAGE_TAG`, the release image to run
- `CODEX_POOLER_HTTP_PORT`, the local host port, default `4000`
- `DATABASE_URL`, the Postgres connection used by the app
- `SECRET_KEY_BASE`, Phoenix signing and encryption secret
- `PHX_HOST`, `PORT`, and `PHX_SERVER`, HTTP endpoint boot settings
- `OBAN_MODE` and `OBAN_JOBS_QUEUE_LIMIT`, release role and queue topology
- `DNS_CLUSTER_QUERY`, plus release distribution variables when clustering is on
- `CODEX_POOLER_TOTP_ENCRYPTION_KEY` and `CODEX_POOLER_TOTP_KEY_VERSION`, TOTP
encryption root and version
- `CODEX_POOLER_UPSTREAM_SECRET_KEY` and
`CODEX_POOLER_UPSTREAM_SECRET_KEY_VERSION`, upstream secret encryption root
and version; the key must be 32 raw bytes or base64-encoded 32 bytes
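`generate-env.sh` already produces these secrets. If you need to create or check an upstream secret key by hand, the length rule can be sketched as follows. The function names are hypothetical, and standard padded base64 is an assumption about the accepted encoding.

```python
import base64
import binascii
import secrets


def generate_upstream_secret_key() -> str:
    """Return 32 random bytes, base64-encoded (44 ASCII characters)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def is_valid_upstream_secret_key(value: str) -> bool:
    """Accept either 32 raw bytes or base64 that decodes to 32 bytes."""
    if len(value.encode("utf-8")) == 32:
        return True
    try:
        return len(base64.b64decode(value, validate=True)) == 32
    except (binascii.Error, ValueError):
        return False
```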

Operational controls such as file limits, ingress trust, gateway diagnostics,
route-class admission, circuit thresholds, metrics auth, operator email, model
metadata, upstream timeouts, the OpenAI pricing catalog URL, and SMTP delivery
live in DB-managed Instance Settings under `/admin/system`. Live settings apply
to new runtime work through the settings cache. Cached settings reload after save
through PubSub invalidation; existing leases, in-flight requests, and already-open
streams keep the values they started with.

Secret Instance Settings stay write-only in the UI. The metrics bearer token is
stored only as a keyed HMAC digest, fingerprint, and key version. The SMTP
password is stored encrypted with key version metadata and is recovered only for
mail send or credential-test paths.
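Storing a keyed digest instead of the token means a presented token can be verified without the raw value ever being recoverable from the database. A minimal sketch of that pattern, assuming HMAC-SHA-256 and a fingerprint taken from the digest prefix (the README names neither the hash nor how the fingerprint is derived, and the function names are hypothetical):

```python
import hashlib
import hmac
from typing import Dict, Mapping, Union

StoredToken = Dict[str, Union[str, int]]


def digest_token(token: str, key: bytes, key_version: int) -> StoredToken:
    """Produce the record to persist; the raw token is not part of it."""
    mac = hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"digest": mac, "fingerprint": mac[:12], "key_version": key_version}


def verify_token(presented: str, stored: StoredToken, keys: Mapping[int, bytes]) -> bool:
    """Recompute the digest with the recorded key version and compare."""
    key = keys.get(int(stored["key_version"]))
    if key is None:
        return False
    candidate = hmac.new(key, presented.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(candidate, str(stored["digest"]))
```

Recording the key version alongside the digest lets the HMAC key be rotated while older digests remain verifiable.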

## Deployment

Docker Compose is the easiest way to try the software. For Kubernetes, use the
`icoretech/codex-pooler` Helm chart from the
[iCoreTech Helm repository](https://github.com/icoretech/helm). The chart
deploys the same release image with separate app, worker, scheduler, and
migration roles. It expects an explicit immutable image tag for real
deployments. Official release images include the OS IANA timezone database used
for operator timezone display. Custom runtime images or hosts must provide
zoneinfo files at `/usr/share/zoneinfo` or set `TZDIR`. The chart defaults the
web app to one replica because backend websocket continuity depends on a live
upstream websocket owned by a single app pod. Owner-alive cross-node forwarding
is wired, but scaling web replicas still requires clustering, owner-forwarding,
and the explicit unsafe topology acknowledgement until Kubernetes smoke tests
provide the evidence to relax that guard.

The Helm migration hook runs database migrations and imports the vendored OpenAI
pricing feed so request-log cost reporting has pricing snapshots after install
or upgrade. The scheduler also refreshes pricing hourly from the OpenAI pricing
catalog URL in Instance Settings, which defaults to
`https://icoretech.github.io/openai-json-pricing/pricing.json`.

## Local Development

Local development runs Phoenix on the host and Postgres through the dev compose
file:

```bash
make dev
```

`make dev` starts Postgres, prepares the database, imports the vendored OpenAI
pricing feed, and starts the Phoenix server on `http://localhost:4000`. Logs
are written to `tmp/dev-server.log`.

Development seeds are optional and only run through the explicit seed task. To
create a compact idempotent operator baseline with one owner plus four example
operators, run:

```bash
mix dev.seed compact
```

All seeded operators use `dev-password-123`.

To recreate a fuller fake dataset for exercising admin UI states without real
accounts or real request data, run:

```bash
mix dev.seed full
```

The full seed is idempotent and replaces only deterministic `dev-*` fake rows
owned by the development seed namespace. It includes active/disabled pools,
active/paused/revoked API keys, upstream accounts in active/refresh/reauth/paused
states, quota windows, request logs, invites, audit events, and job rows.

Common checks:

```bash
mix precommit
mix quality
docker compose -f docker-compose.dev.yml config
docker build .
```

Helm chart validation lives with the published chart in the iCoreTech Helm
repository when Kubernetes deployment behavior or values change.

`mix test` and `mix precommit` serialize database-backed test runs with a
PostgreSQL advisory lock keyed by the configured test database, so concurrent
local runs wait instead of deadlocking the shared sandbox database.
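To illustrate the idea of a lock keyed by database name: PostgreSQL's `pg_advisory_lock` takes a signed 64-bit key, so a name has to be mapped to an integer first. The sketch below shows one way to do that and is not the project's actual Mix task code; the SHA-256 mapping and both function names are assumptions.

```python
import hashlib


def advisory_lock_key(database_name: str) -> int:
    """Map a database name to a stable signed 64-bit advisory lock key."""
    digest = hashlib.sha256(database_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def lock_statement(database_name: str) -> str:
    """SQL that blocks until the session-level advisory lock is free."""
    return f"SELECT pg_advisory_lock({advisory_lock_key(database_name)})"
```

Two runs configured with the same test database compute the same key, so the second one waits on the lock instead of contending for the sandbox.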