{"id":15360525,"url":"https://github.com/devmotion/pycalibration","last_synced_at":"2025-04-15T08:21:24.518Z","repository":{"id":44132476,"uuid":"284147655","full_name":"devmotion/pycalibration","owner":"devmotion","description":"Estimation and hypothesis tests of calibration in Python using CalibrationErrors.jl and CalibrationTests.jl.","archived":false,"fork":false,"pushed_at":"2022-10-30T21:19:57.000Z","size":32,"stargazers_count":7,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-28T18:09:13.863Z","etag":null,"topics":["calibration","julia","python","reliability"],"latest_commit_sha":null,"homepage":"https://devmotion.github.io/CalibrationErrors.jl/dev","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devmotion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.bib","codeowners":null,"security":null,"support":null}},"created_at":"2020-07-31T23:22:45.000Z","updated_at":"2023-03-09T16:04:13.000Z","dependencies_parsed_at":"2023-01-20T04:31:46.992Z","dependency_job_id":null,"html_url":"https://github.com/devmotion/pycalibration","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devmotion%2Fpycalibration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devmotion%2Fpycalibration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devmotion%2Fpycalibration/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devmotion%2Fpycalibration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/devmotion","download_url":"https://codeload.github.com/devmotion/pycalibration/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249031951,"owners_count":21201376,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["calibration","julia","python","reliability"],"created_at":"2024-10-01T12:50:27.538Z","updated_at":"2025-04-15T08:21:24.485Z","avatar_url":"https://github.com/devmotion.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pycalibration\n\nEstimation and hypothesis tests of calibration in Python using CalibrationErrors.jl and CalibrationTests.jl.\n\n[![Stable](https://img.shields.io/badge/Julia%20docs-stable-blue.svg)](https://devmotion.github.io/CalibrationErrors.jl/stable)\n[![Dev](https://img.shields.io/badge/Julia%20docs-dev-blue.svg)](https://devmotion.github.io/CalibrationErrors.jl/dev)\n[![Status](https://github.com/devmotion/pycalibration/workflows/CI/badge.svg?branch=main)](https://github.com/devmotion/pycalibration/actions?query=workflow%3ACI+branch%3Amain)\n[![codecov](https://codecov.io/gh/devmotion/pycalibration/branch/main/graph/badge.svg?token=URGY32W5EB)](https://codecov.io/gh/devmotion/pycalibration)\n[![CalibrationErrors.jl Status](https://img.shields.io/github/workflow/status/devmotion/CalibrationErrors.jl/CI/main?label=CalibrationErrors.jl)](https://github.com/devmotion/CalibrationErrors.jl/actions?query=workflow%3ACI+branch%3Amain)\n[![CalibrationTests.jl 
Status](https://img.shields.io/github/workflow/status/devmotion/CalibrationTests.jl/CI/main?label=CalibrationTests.jl)](https://github.com/devmotion/CalibrationTests.jl/actions?query=workflow%3ACI+branch%3Amain)\n\npycalibration is a package for estimating calibration of probabilistic models in Python.\nIt is a Python interface for [CalibrationErrors.jl](https://github.com/devmotion/CalibrationErrors.jl) and [CalibrationTests.jl](https://github.com/devmotion/CalibrationTests.jl).\nAs such, the package allows the estimation of calibration errors (ECE and SKCE) and statistical testing of the null hypothesis that a model is calibrated.\n\n## Installation\n\nTo install `pycalibration`, run\n\n```shell\npython -m pip install git+https://github.com/devmotion/pycalibration.git\n```\n\nThe use of `pycalibration` requires that its dependency\n[`pyjulia`](https://github.com/JuliaPy/pyjulia) (installed automatically)\nand itself are configured correctly.\n\nFor `pyjulia`, you have to\n[install Julia](https://pyjulia.readthedocs.io/en/latest/installation.html#step-1-install-julia) (at least version 1.6 is required)\nand the\n[Julia dependencies of `pyjulia`](https://pyjulia.readthedocs.io/en/latest/installation.html#step-3-install-julia-packages-required-by-pyjulia).\nThe configuration process is described in detail in the\n[`pyjulia` documentation](https://pyjulia.readthedocs.io/en/latest/installation.html).\n\nWhen `pyjulia` is configured correctly, you can install the Julia packages required by\n`pycalibration` in the Python interpreter:\n\n```pycon\n\u003e\u003e\u003e import pycalibration\n\u003e\u003e\u003e pycalibration.install()\n```\n\n### Custom Julia environment\n\nWith the default settings, `pyjulia` and `pycalibration` install all Julia dependencies\nin the default environment. 
In particular, if you use Julia for other projects as well,\na separate [project environment](https://pkgdocs.julialang.org/v1/environments/) can\nsimplify package management and ensure that the state of the Julia dependencies is\nreproducible. In `pyjulia` and `pycalibration`, a custom project environment is used if\nyou set the environment variable `JULIA_PROJECT`:\n\n```shell\nexport JULIA_PROJECT=\"path/to/the/environment/\"\n```\n\n## Usage\n\nImport and set up calibration analysis tools from CalibrationErrors.jl and CalibrationTests.jl with\n```pycon\n\u003e\u003e\u003e from pycalibration import ca\n```\n\nYou can then use the tools as you would in Julia, except that you have to prefix functionality from the Julia packages with `ca.`.\nMost of the commands will work without any modification.\nThus, the documentation of the Julia packages is the main in-depth documentation for this package.\n\n### Valid identifiers\n\nNot all valid Julia identifiers are valid Python identifiers. This is an inherent\nlimitation of [Python and `pyjulia`](https://pyjulia.readthedocs.io/en/latest/limitations.html#mismatch-in-valid-set-of-identifiers). In particular, it is a common idiom in Julia to\nappend `!` to functions that mutate their arguments, but it is not possible to use\n`!` in function names in Python. `pyjulia` renames these functions by substituting\n`!` with `_b`; e.g., you can call the Julia function `copy!` with `copy_b` in Python.\n\n### Calibration errors\n\nLet us estimate the squared kernel calibration error (SKCE) with the tensor\nproduct kernel\n```math\nk((p, y), (p̃, ỹ)) = exp(-|p - p̃|) δ(y - ỹ)\n```\nfrom a set of predictions and corresponding observed outcomes.\n\n```pycon\n\u003e\u003e\u003e skce = ca.SKCE(ca.tensor(ca.ExponentialKernel(), ca.WhiteKernel()))\n```\n\nOther estimators of the SKCE and estimators of other calibration errors such\nas the expected calibration error (ECE) are available as well. 
The Julia package\n[KernelFunctions.jl](https://github.com/JuliaGaussianProcesses/KernelFunctions.jl)\nsupports a variety of kernels; all compositions and transformations of\n[kernels available there](https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/kernels/)\ncan be used.\n\n#### Sequences of probabilities\n\nPredictions can be provided as sequences of probabilities. In this case, the\npredictions correspond to Bernoulli distributions with these parameters and the\ntargets are boolean values.\n\n```pycon\n\u003e\u003e\u003e import random\n\u003e\u003e\u003e random.seed(1234)\n\u003e\u003e\u003e predictions = [random.random() for _ in range(100)]\n\u003e\u003e\u003e outcomes = [bool(random.getrandbits(1)) for _ in range(100)]\n\u003e\u003e\u003e skce(predictions, outcomes)\n0.028399084017414655\n```\n\nNumPy arrays are supported as well.\n\n```pycon\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e rng = np.random.default_rng(1234)\n\u003e\u003e\u003e predictions = rng.random(100)\n\u003e\u003e\u003e outcomes = rng.choice([True, False], 100)\n\u003e\u003e\u003e skce(predictions, outcomes)\n0.03320398246523166\n```\n\n#### Sequences of probability vectors\n\nPredictions can be provided as sequences of probability vectors (i.e., vectors\nin the probability simplex) as well. In this case, the predictions correspond to categorical\ndistributions with these class probabilities and the targets are integers in `{1,...,n}`.\n\n```pycon\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e rng = np.random.default_rng(1234)\n\u003e\u003e\u003e predictions = [rng.dirichlet((3, 2, 5)) for _ in range(100)]\n\u003e\u003e\u003e outcomes = rng.integers(low=1, high=4, size=100)\n\u003e\u003e\u003e skce(predictions, outcomes)\n0.02015240706950358\n```\n\nSequences of probability vectors can also be provided as NumPy matrices. 
However, you have to\nspecify whether the probability vectors correspond to rows or columns of the matrix\nby wrapping them in `ca.RowVecs` or `ca.ColVecs`, respectively. These wrappers are defined\nin [KernelFunctions.jl](https://github.com/JuliaGaussianProcesses/KernelFunctions.jl).\n\n```pycon\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e rng = np.random.default_rng(1234)\n\u003e\u003e\u003e predictions = rng.dirichlet((3, 2, 5), 100)\n\u003e\u003e\u003e outcomes = rng.integers(low=1, high=4, size=100)\n\u003e\u003e\u003e skce(ca.RowVecs(predictions), outcomes)\n0.02015240706950358\n```\n\nThe wrappers also have to be used for, e.g., lists of lists since `pyjulia` converts them\nto matrices automatically.\n\n```pycon\n\u003e\u003e\u003e predictions = [[0.1, 0.8, 0.1], [0.2, 0.5, 0.3]]\n\u003e\u003e\u003e outcomes = [2, 3]\n\u003e\u003e\u003e skce(ca.RowVecs(predictions), outcomes)\n-0.10317943453412069\n```\n\n#### Sequences of probability distributions\n\nPredictions can also be provided as sequences of probability distributions defined in the\nJulia package [Distributions.jl](https://github.com/JuliaStats/Distributions.jl). 
Currently,\nanalytical formulas for the estimators of the SKCE and unnormalized calibration mean embedding\n(UCME) are implemented for uni- and multivariate normal distributions `ca.Normal` and\n`ca.MvNormal` with squared exponential kernels on the target space and Laplace distributions\n`ca.Laplace` with exponential kernels on the target space.\n\nIn this example we use the tensor product kernel\n```math\nk((p, y), (p̃, ỹ)) = exp(-W₂(p, p̃)) exp(-(y - ỹ)²/2),\n```\nwhere `W₂(p, p̃)` is the 2-Wasserstein distance of the two normal distributions `p` and `p̃`.\nIt is given by\n```math\nW₂(p, p̃) = √((μ - μ̃)² + (σ - σ̃)²),\n```\nwhere `p = N(μ, σ)` and `p̃ = N(μ̃, σ̃)`.\n\n```pycon\n\u003e\u003e\u003e import random\n\u003e\u003e\u003e random.seed(1234)\n\u003e\u003e\u003e predictions = [ca.Normal(random.gauss(0, 1), random.random()) for _ in range(100)]\n\u003e\u003e\u003e outcomes = [random.gauss(0, 1) for _ in range(100)]\n\u003e\u003e\u003e skce = ca.SKCE(ca.tensor(ca.ExponentialKernel(metric=ca.Wasserstein()), ca.SqExponentialKernel()))\n\u003e\u003e\u003e skce(predictions, outcomes)\n0.02203618235964146\n```\n\n### Calibration tests\n\n`pycalibration` provides different calibration tests that estimate the p-value of the null hypothesis\nthat a model is calibrated, based on a set of predictions and outcomes:\n- `ca.ConsistencyTest` estimates the p-value with consistency resampling for a given calibration error estimator\n- `ca.DistributionFreeSKCETest` computes distribution-free (and therefore usually quite weak) upper bounds of the p-value for different estimators of the SKCE\n- `ca.AsymptoticBlockSKCETest` estimates the p-value based on the asymptotic distribution of the unbiased block estimator of the SKCE\n- `ca.AsymptoticSKCETest` estimates the p-value based on the asymptotic distribution of the unbiased estimator of the SKCE\n- `ca.AsymptoticCMETest` estimates the p-value based on the asymptotic distribution of the UCME\n\n```pycon\n\u003e\u003e\u003e 
import numpy as np\n\u003e\u003e\u003e rng = np.random.default_rng(1234)\n\u003e\u003e\u003e predictions = rng.dirichlet((3, 2, 5), 100)\n\u003e\u003e\u003e outcomes = rng.integers(low=1, high=4, size=100)\n\u003e\u003e\u003e kernel = ca.tensor(ca.ExponentialKernel(metric=ca.TotalVariation()), ca.WhiteKernel())\n\u003e\u003e\u003e test = ca.AsymptoticSKCETest(kernel, predictions, outcomes)\n\u003e\u003e\u003e print(test)\n\u003cPyCall.jlwrap Asymptotic SKCE test\n--------------------\nPopulation details:\n    parameter of interest:   SKCE\n    value under h_0:         0.0\n    point estimate:          6.07887e-5\n\nTest summary:\n    outcome with 95% confidence: fail to reject h_0\n    one-sided p-value:           0.4330\n\nDetails:\n    test statistic: -4.955380469272125\n\u003e\u003e\u003e ca.pvalue(test)\n0.435\n```\n\n## Citing\n\nIf you use pycalibration as part of your research, teaching, or other activities, please consider citing the following publications:\n\nWidmann, D., Lindsten, F., \u0026 Zachariah, D. (2019). [Calibration tests in multi-class\nclassification: A unifying framework](https://proceedings.neurips.cc/paper/2019/hash/1c336b8080f82bcc2cd2499b4c57261d-Abstract.html). In\n*Advances in Neural Information Processing Systems 32 (NeurIPS 2019)* (pp. 12257–12267).\n\nWidmann, D., Lindsten, F., \u0026 Zachariah, D. 
(2021).\n[Calibration tests beyond classification](https://openreview.net/forum?id=-bxf89v3Nx).\n*International Conference on Learning Representations (ICLR 2021)*.\n\n## Acknowledgements\n\nThis work was financially supported by the Swedish Research Council via the projects *Learning of Large-Scale Probabilistic Dynamical Models* (contract number: 2016-04278), *Counterfactual Prediction Methods for Heterogeneous Populations* (contract number: 2018-05040), and *Handling Uncertainty in Machine Learning Systems* (contract number: 2020-04122), by the Swedish Foundation for Strategic Research via the project *Probabilistic Modeling and Inference for Machine Learning* (contract number: ICA16-0015), by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation, and by ELLIIT.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevmotion%2Fpycalibration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevmotion%2Fpycalibration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevmotion%2Fpycalibration/lists"}