{"id":13689050,"url":"https://github.com/scikit-hep/hepstats","last_synced_at":"2025-04-12T22:30:35.701Z","repository":{"id":34999648,"uuid":"191397511","full_name":"scikit-hep/hepstats","owner":"scikit-hep","description":"Statistics tools and utilities. ","archived":false,"fork":false,"pushed_at":"2025-03-05T11:41:01.000Z","size":68003,"stargazers_count":75,"open_issues_count":6,"forks_count":16,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-04T02:09:17.159Z","etag":null,"topics":["hep","hep-ex","python","python3","scikit-hep","statistical-inference","statistics"],"latest_commit_sha":null,"homepage":"https://scikit-hep.org/hepstats/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scikit-hep.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.rst","contributing":null,"funding":null,"license":"LICENSES/LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-11T15:14:27.000Z","updated_at":"2025-03-25T07:08:25.000Z","dependencies_parsed_at":"2023-02-18T08:31:24.060Z","dependency_job_id":"5c6214b6-6839-4aa8-b803-6f0b99636aef","html_url":"https://github.com/scikit-hep/hepstats","commit_stats":{"total_commits":532,"total_committers":11,"mean_commits":48.36363636363637,"dds":0.5469924812030076,"last_synced_commit":"30e8ff0f2b115a9b29ae7c27af0047f549771e10"},"previous_names":["scikit-hep/scikit-stats"],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scikit-hep%2Fhepstats","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scikit-hep%2Fhepstats/tags","releases_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/scikit-hep%2Fhepstats/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scikit-hep%2Fhepstats/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scikit-hep","download_url":"https://codeload.github.com/scikit-hep/hepstats/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248639647,"owners_count":21137882,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hep","hep-ex","python","python3","scikit-hep","statistical-inference","statistics"],"created_at":"2024-08-02T15:01:32.107Z","updated_at":"2025-04-12T22:30:35.681Z","avatar_url":"https://github.com/scikit-hep.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\u003cimg src=\"https://raw.githubusercontent.com/scikit-hep/hepstats/master/docs/images/logo.png\" width=\"450\"\u003e\n\n\n# `hepstats` package: statistics tools and utilities\n\n[![Scikit-HEP][sk-badge]](https://scikit-hep.org/)\n\n[![PyPI](https://img.shields.io/pypi/v/hepstats)](https://pypi.org/project/hepstats/)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/hepstats)](https://pypi.org/project/hepstats/)\n[![Conda latest 
release](https://img.shields.io/conda/vn/conda-forge/hepstats.svg)](https://anaconda.org/conda-forge/hepstats)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3519200.svg)](https://doi.org/10.5281/zenodo.3519200)\n\n![CI](https://github.com/scikit-hep/hepstats/workflows/CI/badge.svg)\n[![codecov](https://codecov.io/gh/scikit-hep/hepstats/branch/master/graph/badge.svg)](https://codecov.io/gh/scikit-hep/hepstats)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/scikit-hep/hepstats/master)\n\nhepstats is a library for statistical inference that aims to cover the needs of High Energy Physics.\nIt is part of the [Scikit-HEP project](https://scikit-hep.org/).\n\n**Questions**: for usage questions, use [StackOverflow with the hepstats tag](https://stackoverflow.com/questions/ask?tags=hepstats)\n**Bugs and odd behavior**: open [an issue with hepstats](https://github.com/scikit-hep/hepstats/issues/new)\n\n## Installation\n\nInstall `hepstats` like any other Python package:\n\n```\npip install hepstats\n```\n\nor similar (use e.g. `virtualenv` if you wish).\n\n## Changelog\nSee the [changelog](https://github.com/scikit-hep/hepstats/blob/master/CHANGELOG.md) for a history of notable changes.\n\n## Getting Started\n\nThe `hepstats` module includes `modeling`, `hypotests` and `splot` submodules. This is a quick user guide to each submodule. The [binder](https://mybinder.org/v2/gh/scikit-hep/hepstats/master) examples are also a good way to get started.\n\n### modeling\n\nThe modeling submodule includes the [Bayesian Blocks algorithm](https://arxiv.org/pdf/1207.5578.pdf), which can be used to improve the binning of histograms. The visual improvement can be dramatic, and more importantly, this algorithm produces histograms that accurately represent the underlying distribution while being robust to statistical fluctuations. 
Here is a small example of the algorithm applied to data sampled from a Laplace distribution, compared to a histogram of the same sample with fine binning.\n\n```python\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e import matplotlib.pyplot as plt\n\u003e\u003e\u003e from hepstats.modeling import bayesian_blocks\n\n\u003e\u003e\u003e data = np.random.laplace(size=10000)\n\u003e\u003e\u003e blocks = bayesian_blocks(data)\n\n\u003e\u003e\u003e plt.hist(data, bins=1000, label='Fine Binning', density=True, alpha=0.6)\n\u003e\u003e\u003e plt.hist(data, bins=blocks, label='Bayesian Blocks', histtype='step', density=True, linewidth=2)\n\u003e\u003e\u003e plt.legend(loc=2)\n```\n\n![bayesian blocks example](https://raw.githubusercontent.com/scikit-hep/hepstats/master/notebooks/modeling/bayesian_blocks_example.png)\n\n### hypotests\n\nThis submodule provides tools for hypothesis tests, such as discovery tests and the computation of upper limits or confidence intervals. hepstats needs a fitting backend, such as [zfit](https://github.com/zfit/zfit), to perform these computations. Any fitting library can be used if its API is compatible with hepstats (see [api checks](https://github.com/scikit-hep/hepstats/blob/master/hepstats/hypotests/utils/fit/api_check.py)).\n\nHere is a simple example of an upper limit calculation on the yield of a Gaussian signal with known mean and sigma over an exponential background. The fitting backend used is the [zfit](https://github.com/zfit/zfit) package. 
An example with a **counting experiment** analysis is also given in the [binder](https://mybinder.org/v2/gh/scikit-hep/hepstats/master) examples.\n\n```python\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e import zfit\n\u003e\u003e\u003e from zfit.loss import ExtendedUnbinnedNLL\n\u003e\u003e\u003e from zfit.minimize import Minuit\n\n\u003e\u003e\u003e bounds = (0.1, 3.0)\n\u003e\u003e\u003e obs = zfit.Space('x', limits=bounds)\n\n\u003e\u003e\u003e bkg = np.random.exponential(0.5, 300)\n\u003e\u003e\u003e peak = np.random.normal(1.2, 0.1, 10)\n\u003e\u003e\u003e data = np.concatenate((bkg, peak))\n\u003e\u003e\u003e data = data[(data \u003e bounds[0]) \u0026 (data \u003c bounds[1])]\n\u003e\u003e\u003e N = data.size\n\u003e\u003e\u003e data = zfit.Data.from_numpy(obs=obs, array=data)\n\n\u003e\u003e\u003e lambda_ = zfit.Parameter(\"lambda\", -2.0, -4.0, -1.0)\n\u003e\u003e\u003e Nsig = zfit.Parameter(\"Nsig\", 1., -20., N)\n\u003e\u003e\u003e Nbkg = zfit.Parameter(\"Nbkg\", N, 0., N*1.1)\n\u003e\u003e\u003e signal = zfit.pdf.Gauss(obs=obs, mu=1.2, sigma=0.1).create_extended(Nsig)\n\u003e\u003e\u003e background = zfit.pdf.Exponential(obs=obs, lambda_=lambda_).create_extended(Nbkg)\n\u003e\u003e\u003e total = zfit.pdf.SumPDF([signal, background])\n\u003e\u003e\u003e loss = ExtendedUnbinnedNLL(model=total, data=data)\n\n\u003e\u003e\u003e from hepstats.hypotests.calculators import AsymptoticCalculator\n\u003e\u003e\u003e from hepstats.hypotests import UpperLimit\n\u003e\u003e\u003e from hepstats.hypotests.parameters import POI, POIarray\n\n\u003e\u003e\u003e calculator = AsymptoticCalculator(loss, Minuit(), asimov_bins=100)\n\u003e\u003e\u003e poinull = POIarray(Nsig, np.linspace(0.0, 25, 20))\n\u003e\u003e\u003e poialt = POI(Nsig, 0)\n\u003e\u003e\u003e ul = UpperLimit(calculator, poinull, poialt)\n\u003e\u003e\u003e ul.upperlimit(alpha=0.05, CLs=True)\n\nObserved upper limit: Nsig = 15.725784747406346\nExpected upper limit: Nsig = 11.927442041887158\nExpected upper limit +1 sigma: Nsig = 
16.596396280677116\nExpected upper limit -1 sigma: Nsig = 8.592750403611896\nExpected upper limit +2 sigma: Nsig = 22.24864429383046\nExpected upper limit -2 sigma: Nsig = 6.400549971360598\n```\n\n![upper limit example](https://raw.githubusercontent.com/scikit-hep/hepstats/master/notebooks/hypotests/asy_ul.png)\n\n### splot\n\nA full example using the **sPlot** algorithm can be found [here](https://github.com/scikit-hep/hepstats/tree/master/notebooks/splots/splot_example.ipynb). **sWeights** for the different components of a data sample, modeled with a sum of extended probability density functions, are derived using the `compute_sweights` function:\n\n```python\n\u003e\u003e\u003e from hepstats.splot import compute_sweights\n\n# using the same model as above for illustration\n\u003e\u003e\u003e sweights = compute_sweights(zfit.pdf.SumPDF([signal, background]), data)\n\n\u003e\u003e\u003e bkg_sweights = sweights[Nbkg]\n\u003e\u003e\u003e sig_sweights = sweights[Nsig]\n```\n\nThe model must be fitted to the data before the **sWeights** are computed; otherwise an error is raised.\n\n[sk-badge]: 
https://img.shields.io/badge/Scikit--HEP-Project-blue?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABoAAAAcCAYAAAB/E6/TAAAACXBIWXMAAAEZAAABGQHyCY1sAAAAGXRFWHRTb2Z0d2FyZQB3d3cuaW5rc2NhcGUub3Jnm+48GgAAA6dJREFUSImdlktonFUUx/930kQ0nYo2JX5NUqSghSq2oIgvcCFaC0EEH2AXUh9FEV1UUaGIC924qY8igo+FQi26sWp8gDS24qOgCOIiKCVomLlnZpxk7Dh2GjMz389F7tgv4zfNJGdzh/M/5/zuud/ch9MKDFgnaUjSnHOu2kuOmb0h6brMMoVHgceBY8ApSVVJ05JOAnXga+BJ4OK0fO/9PZL2AL91AwwBLwLz9GYLwKvAcLtGLpcbMbM5MyuXy+UoDXI14BNFcsABYBy4DLgojDvDZH5PxJaAG4CM937SzCgUCnemQcaB0yFpDngMGFhmefuBh4E/Qt4/tVrtoJlhZq+nJWwHaiH4F2DL2QAp+SPA9wBxHDM7O5svl8vZzqBzgOkAOQGsXwmkbbVabUOj0Wh/1xIw+J+Yy+UuBJ4O4jywdTUQSSoUCgdKpRJxHC+Ees8mxVKr1WoGYf9qId77m80sNrNvgedDvb+A8yQpMzg4OJHJZPoAVavVQ6uBmNmQc+4dSfVWq7Vb0n5JC5KyknZIUiabzdYlqdFoqF6vTxSLxctXwXpNUuSce3RsbOyEc+6kpKNBG5ekjKRLguMTSUNxHE/m8/ntvRK89w9IukvS4SiK3k5Ix8N4aRu0UZIGBgaOAHdIWpfJZI56769fDlIqlTY7515yzlkcx3s65xDGjW1Qf3A0R0ZGJpxzOyX1Oee+MLMd3SDAmjiOD0paK+nB0dHRuY6QhTD2t0EWHJEkRVF0zDl3k6TTkj5OPUIkmdkzwLWSXomi6POUkNFkZxlJM8GxrR0RRdEPzrkbnXOzwHve+/s7IFc55/ZJmmq1Wvu6NH1FGGfaS3B3YrMuOTJKpdJmM5sO+2OvJBWLxUEz+9XM5vP5/DalWDhpqqHu7rYzmzhI96Ys0aZQmEKh8IKZvRV+P9GlEwEPhXoNYCgpvByE2SXCmc6GzeyncCLjvZ8EUi9N4HygEOq92SmuB/4M4pdAf2eBmZmZC7z3lQDb1AXSB3wW6vwNpF54twOtEPQBsLYzplgsfmhmpHUDnAscCvkxsCttMu3gpzhjPwNLNq2ZHU4DsXgr/5jIfa4rJJF0H0vfCp8C9wLDSRCwAdgFfBQ6gMW3wyPLQhKwK4Gv+L81m80mwKkU7Tvgmp4hHcBbgXcTf5ROqwLvA7cBblWQDmA/sLVSqXxTqVQAbmHxJXTWh0vS1vQS5JxrSJoys3JwHXHOxSuZbE+ghE1J2rJSiCT9CxJT5EBIY81lAAAAAElFTkSuQmCC\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscikit-hep%2Fhepstats","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscikit-hep%2Fhepstats","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscikit-hep%2Fhepstats/lists"}