{"id":43906806,"url":"https://github.com/williamwinkler/openai-batch-manager","last_synced_at":"2026-03-03T11:12:42.001Z","repository":{"id":335911427,"uuid":"1082171617","full_name":"williamwinkler/openai-batch-manager","owner":"williamwinkler","description":"OpenAI Batch Manager helps you reduce OpenAI API costs for non-urgent workloads with a simple “send request, get result later” workflow using the Batch API.","archived":false,"fork":false,"pushed_at":"2026-02-06T22:33:31.000Z","size":568,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-07T02:50:16.143Z","etag":null,"topics":["ash-framework","elixir","oban","openai","openai-batch","prompts","rabbitmq","webhook"],"latest_commit_sha":null,"homepage":"","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/williamwinkler.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-10-23T20:54:21.000Z","updated_at":"2026-02-06T22:33:21.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/williamwinkler/openai-batch-manager","commit_stats":null,"previous_names":["williamwinkler/openai-batch-manager"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/williamwinkler/openai-batch-manager","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamwinkler%2Fopenai-batch-manager","tags_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamwinkler%2Fopenai-batch-manager/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamwinkler%2Fopenai-batch-manager/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamwinkler%2Fopenai-batch-manager/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/williamwinkler","download_url":"https://codeload.github.com/williamwinkler/openai-batch-manager/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamwinkler%2Fopenai-batch-manager/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29333927,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-11T12:42:24.625Z","status":"ssl_error","status_checked_at":"2026-02-11T12:41:23.344Z","response_time":97,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ash-framework","elixir","oban","openai","openai-batch","prompts","rabbitmq","webhook"],"created_at":"2026-02-06T20:03:43.791Z","updated_at":"2026-03-03T11:12:41.996Z","avatar_url":"https://github.com/williamwinkler.png","language":"Elixir","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OpenAI Batch Manager\n\nOpenAI Batch Manager is a self-hosted service that turns the 
[OpenAI Batch API](https://developers.openai.com/api/docs/guides/batch/) into a simple workflow:\nsend requests now, receive results later (via webhook or RabbitMQ).\n\nIt includes an interactive UI at [http://localhost:4000](http://localhost:4000).\n\n![ui](/docs/ui.png)\n\n## Quickstart (Docker Compose)\n\nThis repo includes a `docker-compose.yml` that runs Postgres + OpenAI Batch Manager.\n\n```bash\ncp .env.example .env\n# edit .env and set OPENAI_API_KEY\ndocker compose up -d --build\n```\n\n`docker compose` loads `.env` automatically, so `source .env` is not required.\n`docker-compose.yml` provisions Postgres and sets a default internal `DATABASE_URL`.\nIf you are using an external Postgres instance, set `DATABASE_URL` in `.env`.\n\nThen open:\n\n- UI: [http://localhost:4000](http://localhost:4000)\n- Health check: [http://localhost:4000/health](http://localhost:4000/health)\n- OpenAPI JSON: [http://localhost:4000/api/openapi](http://localhost:4000/api/openapi)\n- Swagger UI: [http://localhost:4000/api/swaggerui](http://localhost:4000/api/swaggerui)\n\nOptional: enable RabbitMQ intake/delivery\n\n1. In `docker-compose.yml`, uncomment the `rabbitmq` service and the `RABBITMQ_*` env var lines under `openai-batch-manager`.\n2. 
Set:\n\n```env\nRABBITMQ_URL=amqp://guest:guest@rabbitmq:5672\nRABBITMQ_INPUT_QUEUE=batch_requests\n```\n\n## Send a Request\n\nSubmit requests to the service, and it will batch/upload/poll/download and deliver results.\n`custom_id` must be globally unique across all requests.\n\nWebhook delivery:\n\n```bash\ncurl -sS -X POST http://localhost:4000/api/requests \\\n  -H 'content-type: application/json' \\\n  -d '{\n    \"custom_id\": \"example_webhook_001\",\n    \"url\": \"/v1/responses\",\n    \"method\": \"POST\",\n    \"body\": {\n      \"model\": \"gpt-4o-mini\",\n      \"input\": \"Return a JSON object with a single key: answer\"\n    },\n    \"delivery_config\": {\n      \"type\": \"webhook\",\n      \"webhook_url\": \"https://example.com/webhook\"\n    }\n  }'\n```\n\nRabbitMQ delivery (queue-only):\n\n```bash\ncurl -sS -X POST http://localhost:4000/api/requests \\\n  -H 'content-type: application/json' \\\n  -d '{\n    \"custom_id\": \"example_rabbitmq_001\",\n    \"url\": \"/v1/responses\",\n    \"method\": \"POST\",\n    \"body\": {\n      \"model\": \"gpt-4o-mini\",\n      \"input\": \"Write a one sentence summary of: OpenAI Batch Manager\"\n    },\n    \"delivery_config\": {\n      \"type\": \"rabbitmq\",\n      \"rabbitmq_queue\": \"batch_results\"\n    }\n  }'\n```\n\n## Why Use This?\n\nThe OpenAI Batch API is powerful, but it turns a single request into a workflow:\nbuild the batch file, upload, poll status, handle partial failures/expired batches, download outputs, and deliver results.\n\nOpenAI Batch Manager abstracts that workflow away. You get:\n\n- A single intake API (`POST /api/requests`) with OpenAPI docs.\n- Automatic batch creation + upload + status polling + result download.\n- Delivery attempts with audit trail in the UI.\n- Automatic cleanup of completed/expired/stale batches.\n\n## How It Works\n\n1. **Submit** requests (via REST API, optionally via RabbitMQ intake queue).\n2. **Batch** by model/endpoint and persist locally.\n3. 
**Upload** to OpenAI Batch API (by default, within about an hour).\n4. **Poll** until OpenAI finishes processing.\n5. **Download** output/error files and parse per-request results.\n6. **Deliver** each result to your destination (webhook or RabbitMQ).\n\n![diagram](/docs/how_it_works_diagram.png)\n\n## Configuration\n\n| Variable | Required | Purpose |\n|----------|:--------:|---------|\n| `OPENAI_API_KEY` | Yes | OpenAI API key used to create and poll batches. |\n| `DATABASE_URL` | Yes | Postgres connection string (Ecto format). |\n| `PORT` | No | HTTP port (default: `4000`). |\n| `RABBITMQ_URL` | No | Enables RabbitMQ output delivery, and input consumption if `RABBITMQ_INPUT_QUEUE` is set. |\n| `RABBITMQ_INPUT_QUEUE` | No | Enables RabbitMQ intake from this queue name (requires `RABBITMQ_URL`). |\n| `DISABLE_DELIVERY_RETRY` | No | When true, delivery attempts are not retried. |\n| `DELIVERY_QUEUE_CONCURRENCY` | No | Number of delivery workers (default: `8`). Lower values reduce DB pressure spikes; higher values improve throughput. |\n| `DELIVERY_ENQUEUE_CHUNK_SIZE` | No | Number of requests enqueued per chunk when a batch starts delivering (default: `200`). |\n| `DELIVERY_ENQUEUE_MAX_ERROR_LOGS` | No | Maximum per-request enqueue warning logs before suppressing repetitive failures (default: `5`). |\n\n## Operational Notes\n\n- Data artifacts live under `/data/batches` in the container. 
Postgres stores metadata (batches, requests, delivery attempts).\n- Batches and their requests are automatically cleaned up:\n  - **Completed batches** are deleted locally once their OpenAI file expires (30 days after upload).\n  - **Stale building batches** that have been idle for over 1 hour are either uploaded (if they contain requests) or deleted (if empty).\n  - **Expired OpenAI batches** (24h processing timeout) have their partial results downloaded and unprocessed requests resubmitted in a new batch.\n\nWhen a batch is deleted locally, its associated OpenAI files (input, output, error) are also cleaned up on the OpenAI platform.\n\nDelivery stability defaults are tuned to avoid bursty DB lock pressure:\n\n- Delivery worker concurrency defaults to `8`.\n- Delivery enqueue fanout is chunked (`200` requests per chunk).\n- Repetitive enqueue failures are summarized to keep logs actionable.\n\nFor very high load, reduce `DELIVERY_QUEUE_CONCURRENCY` and/or `DELIVERY_ENQUEUE_CHUNK_SIZE`.\nIf throughput is too slow and Postgres is healthy, increase concurrency gradually.\n\n## Limitations / Not Supported\n\n- No built-in authentication/authorization. Run behind a reverse proxy, VPN, or private network.\n- Not a low-latency API and not a streaming API; this is for asynchronous batch workloads.\n- Delivery should be treated as at-least-once; make your webhook/RabbitMQ consumers idempotent.\n- RabbitMQ delivery is queue-only (`rabbitmq_queue`); custom exchanges/routing keys are not supported (yet).\n- Only OpenAI Batch API workflows are in scope (not a multi-provider LLM router).\n\n## Development (From Source)\n\nYou need Elixir/Erlang (e.g. 
[asdf](https://asdf-vm.com/) with `.tool-versions`), Postgres, and an OpenAI API key.\nFor `mix` development, this app reads `DATABASE_URL_DEV` from `config/dev.exs` (not `DATABASE_URL`).\n\n```bash\ncp .env.example .env\n# edit .env and set OPENAI_API_KEY\n\n# optional: if your local Postgres is not the default\nexport DATABASE_URL_DEV=\"ecto://postgres:postgres@localhost:5432/openai_batch_manager_dev\"\n\nasdf install\nmix setup\niex -S mix phx.server\n```\n\nIf you keep the default local Postgres settings, you can skip `DATABASE_URL_DEV`.\n\n## Development Commands\n\n```bash\nmix test\nmix format\nmix precommit\n```\n\nTo run RabbitMQ tests locally, run `mix test --include rabbitmq` with a RabbitMQ instance available.\n\n## Contributing\n\nOpen an issue for bugs or ideas. PRs welcome—keep changes focused, add tests where relevant, run `mix precommit` before submitting.\n\nCurrent baseline:\n\n1. `mix precommit` is the required local/CI quality gate before merging.\n2. `mix format --check-formatted` must pass.\n3. No committed tooling/noise artifacts (for example `.DS_Store` and local logs).\n4. Keep `Ash.*` orchestration out of web modules (`lib/batcher_web`) unless explicitly justified.\n\n## License\n\n[MIT](LICENSE) — use, modify, and redistribute; keep the license and copyright notice in redistributed code.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwilliamwinkler%2Fopenai-batch-manager","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwilliamwinkler%2Fopenai-batch-manager","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwilliamwinkler%2Fopenai-batch-manager/lists"}