{"id":48415392,"url":"https://github.com/paulbrodersen/entropy_estimators","last_synced_at":"2026-04-06T07:07:22.294Z","repository":{"id":45708759,"uuid":"53587540","full_name":"paulbrodersen/entropy_estimators","owner":"paulbrodersen","description":"Estimators for the entropy and other information theoretic quantities of continuous distributions","archived":false,"fork":false,"pushed_at":"2024-05-14T11:09:29.000Z","size":53,"stargazers_count":145,"open_issues_count":1,"forks_count":28,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-09-27T21:14:48.576Z","etag":null,"topics":["density-estimation","entropy"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paulbrodersen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-10T13:44:27.000Z","updated_at":"2025-09-20T08:18:20.000Z","dependencies_parsed_at":"2023-11-29T11:57:44.494Z","dependency_job_id":null,"html_url":"https://github.com/paulbrodersen/entropy_estimators","commit_stats":{"total_commits":31,"total_committers":2,"mean_commits":15.5,"dds":0.09677419354838712,"last_synced_commit":"789863a1cf327fb4e2bfdcc100ab67a1f6755ba2"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/paulbrodersen/entropy_estimators","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulbrodersen%2Fentropy_estimators","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulbrodersen%2Fentropy_estimators/tags","releases_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulbrodersen%2Fentropy_estimators/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulbrodersen%2Fentropy_estimators/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paulbrodersen","download_url":"https://codeload.github.com/paulbrodersen/entropy_estimators/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulbrodersen%2Fentropy_estimators/sbom","scorecard":{"id":722935,"data":{"date":"2025-08-11","repo":{"name":"github.com/paulbrodersen/entropy_estimators","commit":"f724b376cfcc70f8e121330f8ab26f9eba3b0241"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.2,"checks":[{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":1,"reason":"Found 3/24 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows 
found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: GNU General Public License v3.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 9 are 
checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-22T11:54:51.807Z","repository_id":45708759,"created_at":"2025-08-22T11:54:51.807Z","updated_at":"2025-08-22T11:54:51.807Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31463018,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T21:22:52.476Z","status":"online","status_checked_at":"2026-04-06T02:00:07.287Z","response_time":112,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["density-estimation","entropy"],"created_at":"2026-04-06T07:07:21.603Z","updated_at":"2026-04-06T07:07:22.289Z","avatar_url":"https://github.com/paulbrodersen.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003e [!CAUTION]\n\u003e This package implements the KL-estimator and the KSG-estimator for the entropy and mutual information of continuous variables (and a few variants thereof).\n\u003e These estimators have issues, and for most purposes, should likely no longer be the first choice.\n\u003e Please read [this comment](https://github.com/paulbrodersen/entropy_estimators/issues/11#issuecomment-2109577671) for a better explanation than I could give, as well as a link to an alternative implementation.\n\n# Entropy estimators\n\nThis module 
implements estimators for the entropy and other\ninformation theoretic quantities of continuous distributions, including:\n\n* entropy / Shannon information (`get_h`),\n* mutual information (`get_mi`),\n* partial mutual information \u0026 transfer entropy (`get_pmi`),\n* specific information (`get_imin`), and\n* partial information decomposition (`get_pid`).\n\nThe estimators derive from the\n[Kozachenko and Leonenko (1987)](https://www.mathnet.ru/php/archive.phtml?wshow=paper\u0026jrnid=ppi\u0026paperid=797\u0026option_lang=eng)\nestimator, which uses k-nearest neighbour distances to compute the entropy of distributions, and extensions thereof developed by\n[Kraskov et al. (2004)](https://arxiv.org/abs/cond-mat/0305641),\nand\n[Frenzel and Pompe (2007)](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.99.204101).\n\nFor **multivariate normal distributions**, the following quantities can be computed analytically from the covariance matrix:\n\n* entropy (`get_h_mvn`),\n* mutual information (`get_mi_mvn`), and\n* partial mutual information \u0026 transfer entropy (`get_pmi_mvn`).\n\n\n## Installation\n\nEasiest via pip:\n\n``` shell\npip install entropy_estimators\n```\n\n## Examples\n\n```python\n\nimport numpy as np\nfrom entropy_estimators import continuous\n\n# create some normal test data\nX = np.random.randn(10000, 2)\n\n# compute the entropy analytically from the determinant of the covariance matrix,\n# assuming a multivariate normal distribution:\nanalytic = continuous.get_h_mvn(X)\n\n# compute the entropy using the k-nearest neighbour approach\n# developed by Kozachenko and Leonenko (1987):\nkozachenko = continuous.get_h(X, k=5)\n\nprint(f\"analytic result: {analytic:.5f}\")\nprint(f\"K-L estimator: {kozachenko:.5f}\")\n\n```\n\n## Frequently asked questions\n\n#### Why is the estimate of the mutual information negative? Shouldn't it always be positive?\n\nMutual information is a non-negative quantity. 
However, its *estimate* need not be, and in fact, nearest-neighbour estimators are known to be biased (Kraskov et al. 2004). Unfortunately, the bias appears to depend on multiple factors, primarily the number of samples and the choice of the `k` parameter, and thus cannot be known *a priori*. However, the bias itself can be estimated using a straightforward permutation / bootstrap approach:\n\n1. Compute the mutual information estimate between two variables, X and Y.\n2. Permute either variable (or both), and re-compute the estimate. The mutual information between randomised variables is zero, so this estimate represents the bias.\n3. Repeat the previous step many times to obtain a robust estimate of the bias.\n\n\n``` python\nimport numpy as np\n\nfrom scipy.stats import multivariate_normal\nfrom entropy_estimators import continuous\n\n# create two variables with a mutual information that can be computed analytically\nmeans = [0, 1]\ncovariance = np.array([[1, 0.5], [0.5, 1]])\n\ndef get_entropy(covariance):\n    \"\"\"Compute the entropy of a multivariate normal distribution from the covariance matrix.\"\"\"\n    if np.size(covariance) \u003e 1:\n        dim = covariance.shape[0]\n        det = np.linalg.det(covariance)\n    else: # scalar\n        dim = 1\n        det = covariance\n    return 0.5 * np.log((2 * np.pi * np.e)**dim * det)\n\nhx  = get_entropy(covariance[0, 0])\nhy  = get_entropy(covariance[1, 1])\nhxy = get_entropy(covariance)\nanalytic_result = hx + hy - hxy\n\n# compute the mutual information from samples using the KSG estimator\ndistribution = multivariate_normal(means, covariance)\nX, Y = distribution.rvs(1000).T\n\nk = 5\nksg_estimate = continuous.get_mi(X, Y, k=k)\n\nprint(f\"Analytic result: {analytic_result:.3f} nats\")\nprint(f\"KSG estimate: {ksg_estimate:.3f} nats\")\nprint(f\"Difference: {analytic_result - ksg_estimate:.3f} nats\")\n# Analytic result: 0.144 nats\n# KSG estimate: 0.113 nats\n# Difference: 0.031 nats\n\n# 
bootstrap to determine the bias\ntotal_repeats = 100\nbias = 0\nY_shuffled = Y.copy()\nfor ii in range(total_repeats):\n    np.random.shuffle(Y_shuffled) # shuffling occurs in-place!\n    bias += continuous.get_mi(X, Y_shuffled, k=k)\nbias /= total_repeats\n\nprint(\"--------------------------------------------------------------------------------\")\nprint(f\"Bias estimate: {bias:.3f} nats\")\nprint(f\"Corrected KSG estimate: {ksg_estimate - bias:.3f} nats\")\nprint(f\"Difference to analytic result: {analytic_result - (ksg_estimate - bias):.3f} nats\")\n# Bias estimate: -0.020 nats\n# Corrected KSG estimate: 0.132 nats\n# Difference to analytic result: 0.012 nats\n```\n\n## Alternative Implementations\n\n### Scipy\n\n[`scipy.stats.entropy`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.entropy.html) : entropy of a categorical variable\n\n### Scikit-learn\n\n * [`sklearn.metrics.mutual_info_score`](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mutual_info_score.html#sklearn.metrics.mutual_info_score) : mutual information between two categorical variables\n\n * [`sklearn.feature_selection.mutual_info_regression`](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.mutual_info_regression.html) :\n mutual information between two continuous variables; note that their implementation does not report negative mutual information scores and thus makes it impossible to compute bias corrections using the bootstrap approach outlined above.\n\n### Non-parametric Entropy Estimation Toolbox (NPEET)\n\nAlternative Python implementations of the nearest-neighbour estimators for the entropy of continuous variables, the mutual information and the partial/conditioned mutual information ([link](https://github.com/gregversteeg/NPEET)). In principle, there are no major differences between their implementation and this repository. 
However, for large samples, their implementation may run a little slower as it uses lists as the primary data structure and doesn't support parallelisation. The implementation in this repository mostly uses NumPy arrays, which allows vectorisation of many calculations, and supports running operations on multiple cores by setting the `workers` argument to a value larger than one.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaulbrodersen%2Fentropy_estimators","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaulbrodersen%2Fentropy_estimators","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaulbrodersen%2Fentropy_estimators/lists"}