{"id":29830824,"url":"https://github.com/finite-sample/mw-calibration","last_synced_at":"2025-07-29T10:11:52.975Z","repository":{"id":304789549,"uuid":"1019975774","full_name":"finite-sample/mw-calibration","owner":"finite-sample","description":"Always‑On Probability Calibration via Multiplicative‑Weights. Comparison to Batch Platt \u0026 Isotonic","archived":false,"fork":false,"pushed_at":"2025-07-15T06:49:15.000Z","size":108,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-15T15:20:03.717Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finite-sample.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-15T06:46:53.000Z","updated_at":"2025-07-15T06:49:19.000Z","dependencies_parsed_at":"2025-07-15T15:46:39.029Z","dependency_job_id":"1ec94625-87bb-4d45-8482-88fc7b386181","html_url":"https://github.com/finite-sample/mw-calibration","commit_stats":null,"previous_names":["finite-sample/mw-calibration"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/finite-sample/mw-calibration","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fmw-calibration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fmw-calibration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fmw-calibration/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fmw-calibration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/finite-sample","download_url":"https://codeload.github.com/finite-sample/mw-calibration/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fmw-calibration/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267668843,"owners_count":24124972,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-29T10:11:39.523Z","updated_at":"2025-07-29T10:11:52.247Z","avatar_url":"https://github.com/finite-sample.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Always‑On Probability Calibration via Multiplicative‑Weights\n\n## 1  Realistic production scenario\n\n**Context**  ▸ A large ad platform serves millions of impressions per hour.  An upstream ML model outputs raw click‑through probabilities $p^{\\text{raw}}$.  Over time, systematic **drift** appears (creative fatigue, seasonality, campaign mix).  
Business KPIs and auctions require **well‑calibrated probabilities** at any moment.\n\n**Key constraint**  ▸ You can’t stop traffic to retrain a calibrator on every batch; compute must stay sub‑millisecond per impression.\n\n---\n\n## 2  What practitioners typically do\n\n| Method                       | Workflow                                                      | Pain‑point                                                           |\n| ---------------------------- | ------------------------------------------------------------- | -------------------------------------------------------------------- |\n| **Platt scaling** (logistic) | Train on yesterday’s data; deploy coefficients until next job | Loses calibration as drift grows; spikes CPU/latency when retrained. |\n| **Isotonic regression**      | Same nightly (or hourly) batch job; guarantees monotonicity   | Same drift issue; heavier CPU and memory if many segments.           |\n\n**Trade‑off**  ▸ Fewer retrains ⟶ lower compute, but larger calibration error between jobs.  
More frequent retrains ⟶ accuracy stays tight, but compute scales as $\\mathcal O(N_{\\text{seen}})$ and may breach the SLA.\n\n---\n\n## 3  Our proposal: **Vectorised Multiplicative‑Weights Update (MWU)**\n\n* Maintain one **bias weight** $c_b$ per calibration bucket / segment.\n* After each mini‑batch: update all buckets **once** via\n  $c_b \\leftarrow c_b\\,\\exp\\bigl(-\\eta\\,(\\hat r_b-\\tilde r_b)\\bigr),$\n  where $\\hat r_b$ is the batch click‑rate and $\\tilde r_b$ the predicted rate.\n* Update cost **$\\mathcal O(\\text{\\#buckets})$** per mini‑batch, independent of how many events have been seen so far.\n* Adapts to drift at every mini‑batch; no full refit, no heavy solver.\n\n---\n\n## 4  Simulation setup\n\n* **200 k** impressions streamed in **40 batches** (5 k each).\n* Upward probability drift encoded in the logit mean $\\mu_t$.\n* **100 reliability buckets.**\n* Compare per‑batch **Brier score** \u0026 **CPU time**:\n\n  * Platt (logistic) and Isotonic (PAV), *retrained every batch* ➊\n  * **MWU** (vectorised bucket update).\n\n➊ We retrain each batch to show compute scaling. 
In practice, the retrain cadence is slower; see §5.\n\n---\n\n## 5  Results (aggregate over 40 batches)\n\n| Metric                   | Platt                    | Isotonic       | **MWU**                     |\n| ------------------------ | ------------------------ | -------------- | --------------------------- |\n| **Mean per‑batch Brier** | 0.2051                   | **0.2045**     | 0.2052                      |\n| **Std (Brier)**          | 0.0019                   | **0.0017**     | 0.0019                      |\n| **Mean CPU s / batch**   | 0.0243                   | 0.0181         | **0.00039**                 |\n| **Compute scaling**      | grows linearly with data | grows linearly | \\~flat ($\\approx$ constant) |\n\n*Platt \u0026 Isotonic achieve slightly lower Brier scores, at the cost of roughly **46× (Isotonic) to 62× (Platt) more CPU** per batch.*\n\n\u003e **If retrained hourly instead of per‑batch**: their compute would drop, but calibration error would **drift upward** between retrains; MWU keeps both error and compute flat.\n\n---\n\n## 6  Take‑aways\n\n* **MWU = always‑on calibrator**: cheap exponential updates keep probabilities aligned without offline jobs.\n* Offers a clean knob (learning rate $\\eta$) to trade stability vs. responsiveness.\n* Ideal for ad serving, recommender systems, or any high‑volume setting where **latency and continual drift** rule out heavy batch retrains.\n\n---\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Fmw-calibration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinite-sample%2Fmw-calibration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Fmw-calibration/lists"}