{"id":29830819,"url":"https://github.com/finite-sample/analytic-bw","last_synced_at":"2025-10-17T09:38:03.206Z","repository":{"id":303133707,"uuid":"1014512931","full_name":"finite-sample/analytic-bw","owner":"finite-sample","description":"Analytic Bandwidth Selector for NW and KDE Based on CV Hessian","archived":false,"fork":false,"pushed_at":"2025-08-17T19:53:23.000Z","size":60,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-26T23:38:27.912Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finite-sample.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-05T21:50:59.000Z","updated_at":"2025-08-17T19:53:25.000Z","dependencies_parsed_at":"2025-08-12T13:47:29.687Z","dependency_job_id":null,"html_url":"https://github.com/finite-sample/analytic-bw","commit_stats":null,"previous_names":["finite-sample/analytic-nw-bw","finite-sample/analytic-bw"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/finite-sample/analytic-bw","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fanalytic-bw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fanalytic-bw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fanalytic-bw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-samp
le%2Fanalytic-bw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/finite-sample","download_url":"https://codeload.github.com/finite-sample/analytic-bw/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Fanalytic-bw/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279318361,"owners_count":26147231,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-17T02:00:07.504Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-29T10:11:36.982Z","updated_at":"2025-10-17T09:38:03.176Z","avatar_url":"https://github.com/finite-sample.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Analytic‑Hessian Bandwidth Selection\n\nA *single* Newton step can pick the optimal kernel bandwidth if you hand it the right derivatives.  We derive **closed‑form gradients *and* Hessians** of the leave‑one‑out cross‑validation (LOOCV) risk for\n\n* univariate **kernel‑density estimation** (KDE), and\n* univariate **Nadaraya–Watson** (NW) kernel regression,\n\ncovering both the Gaussian and Epanechnikov kernels.  
The result is a bandwidth selector that reaches the same minimum as an exhaustive grid scan while using *4–12×* fewer evaluations.\n\n---\n\n## 1  Problem \u0026 Prior Practice\n\n### 1.1  KDE bandwidth\n\nFor density estimation one minimises finite‑sample risk proxies such as\n\n* **LSCV** (least‑squares CV)\n* **LCV** (likelihood CV)\n\nCommon optimisers:\n\n| Optimiser                               | Typical calls | Notes                 |\n| --------------------------------------- | ------------- | --------------------- |\n| Grid (50–100 \\$h\\$’s)                   | 50–100        | textbook default      |\n| Golden‑section                          | 20–25         | still bracketing      |\n| Plug‑in / Pilot (Sheather–Jones, Botev) | 1             | relies on asymptotics |\n\n### 1.2  NW bandwidth\n\nFor regression one often minimises the LOOCV mean‑squared‑error (MSE) surface.  Again, the standard choice is a grid over 40–60 bandwidths.\n\n\u003e **Gap:** Prior work almost always optimises by *searching* the CV surface.  Very little exploits its analytic structure beyond a first derivative.\n\n---\n\n## 2  Newton–Armijo with Analytic Hessian\n\nLet \\$L(h)\\$ denote the CV score (LSCV for KDE, LOOCV‑MSE for NW).  
In log‑bandwidth space \\$u=\\log h\\$ we compute\n$g(u)=\\frac{dL}{du},\\qquad H(u)=\\frac{d^2L}{du^2}.$\nWith those we run\n\n```pseudo\nrepeat until |Δu| \u003c 1e‑6 or max_iter:\n    step ← −g/H           # Newton direction\n    u    ← Armijo(u,step) # back‑track to guarantee descent\n```\n\n* **Analytic derivatives** for both kernels avoid numerical differencing.\n* **Armijo line search** keeps stability when \\$n\\$ is tiny (non‑convex wiggles).\n* **Cost** = one score evaluation per back‑track (6–12 total).\n\nClosed‑form expressions are given in `derivatives.py` – two lines each.\n\n---\n\n## 3  Simulation Design\n\n| Component            | KDE                                                       | Nadaraya–Watson                                                                              |\n| -------------------- | --------------------------------------------------------- | -------------------------------------------------------------------------------------------- |\n| True function        | 50‑50 mix of \\$\\mathcal N(-2,0.5)\\$ \u0026 \\$\\mathcal N(2,1)\\$ | \\$y=f(x)+\\varepsilon\\$, same \\$f\\$ as mixture CDF, \\$\\varepsilon\\sim\\mathcal N(0,\\sigma^2)\\$ |\n| Sample sizes         | \\$n\\in{100,200,500}\\$                                     | same                                                                                         |\n| Noise std \\$\\sigma\\$ | \\${0.5,1,2}\\$                                             | same                                                                                         |\n| Kernels              | Gaussian \u0026 Epanechnikov                                   | Gaussian \u0026 Epanechnikov                                                                      |\n| Replicates           | \\$R=20\\$                                                  | \\$R=20\\$                                                                                     |\n| Risk metric          | ISE on \\[\\$-8,8\\$]           
                             | test‑set MSE (10 000 pts)                                                                    |\n| Methods              | Grid, Golden, Newton–Armijo, Silverman                    | Grid, Golden, Newton–Armijo, Plug‑in                                                         |\n\nScripts: `kde_analytic_hessian.py`, `nw_analytic_hessian.py`.\n\n---\n\n## 4  Results Summary\n\nSee the notebooks for results.\n\n---\n\n## 5  Take‑aways\n\n* **Same optimum, fewer calls.** Newton reaches the exact grid minimum for both problems with 4–12× fewer evaluations.\n* **Kernel generality.** The derivation is only two lines per kernel; extending to other polynomial kernels is trivial.\n* **Small‑sample stability.** Armijo back‑tracking prevents the overshoot hiccups often seen with Epanechnikov near tiny \\$h\\$.\n\n---\n\n## 6  Related Work \u0026 Novelty\n\n| Reference                      | Setting    | Optimiser         | Criterion            | Notes                                |\n| ------------------------------ | ---------- | ----------------- | -------------------- | ------------------------------------ |\n| Loader 1999; Wand \u0026 Jones 1995 | KDE        | Grid / golden     | LSCV                 | Textbook standard                    |\n| Chiu 1992                      | KDE        | Newton            | **GCV** only         | No Hessian; Gaussian kernel only     |\n| Fan \u0026 Gijbels 1995             | Local poly | Iterative plug‑in | Asymptotic MISE      | Different objective                  |\n| Härdle (LOLVC)                 | NW         | Grid              | LOOCV                | Widely cited                         |\n| **This work**                  | KDE \u0026 NW   | **Newton–Armijo** | **Exact LOOCV/LSCV** | First analytic Hessian; 4–12× faster 
|\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Fanalytic-bw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinite-sample%2Fanalytic-bw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Fanalytic-bw/lists"}