Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Non-parametric Entropy Estimation Toolbox
https://github.com/gregversteeg/NPEET
entropy estimator information-theory python
Non-parametric Entropy Estimation Toolbox
- Host: GitHub
- URL: https://github.com/gregversteeg/NPEET
- Owner: gregversteeg
- License: MIT
- Created: 2014-10-10T19:57:02.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2022-10-05T20:58:40.000Z (about 2 years ago)
- Last Synced: 2024-08-03T15:16:43.495Z (3 months ago)
- Topics: entropy, estimator, information-theory, python
- Language: Python
- Size: 304 KB
- Stars: 352
- Watchers: 18
- Forks: 88
- Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
README
NPEET
=====

Non-parametric Entropy Estimation Toolbox
This package contains Python code implementing several entropy estimation functions for both discrete and continuous variables. Information theory provides a model-free way to find structure in complex systems, but difficulties in estimating these quantities have traditionally made these techniques infeasible. This package attempts to allay these difficulties by making modern, state-of-the-art entropy estimation methods accessible in a single easy-to-use Python library.
The implementation is very simple; it only requires that NumPy and SciPy be installed. It includes estimators for entropy, mutual information, and conditional mutual information for both continuous and discrete variables. Additionally, it includes a KL divergence estimator for continuous distributions and a mutual information estimator between continuous and discrete variables, along with some non-parametric tests for evaluating estimator performance.
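For orientation, here is a minimal sketch of the main entry points. The function names below match those in `entropy_estimators`, but the keyword defaults (e.g. the number of nearest neighbors `k` and the log `base`) are assumptions to verify against npeet_doc.pdf:

```python
# A minimal sketch of the main estimators. Function names follow
# entropy_estimators; exact keyword defaults are assumptions -- see npeet_doc.pdf.
from npeet import entropy_estimators as ee

x = [[1.3], [3.7], [5.1], [2.4], [3.4]]  # continuous samples: a list of lists
y = [[1.5], [3.32], [5.3], [2.3], [3.3]]
d = [[0], [1], [1], [0], [1]]            # discrete samples

print(ee.entropy(x))    # differential entropy of a continuous variable
print(ee.mi(x, y))      # mutual information between continuous variables
print(ee.kldiv(x, y))   # KL divergence between two continuous samples
print(ee.entropyd(d))   # entropy of a discrete variable
print(ee.micd(x, d))    # mutual information, continuous vs. discrete
```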
**The main documentation is in npeet_doc.pdf.**
It includes descriptions of the functions, references, implementation details, and a technical discussion of the difficulties in estimating entropies. The code is available here. It requires SciPy 0.12 or greater. This package is mainly geared to estimating information-theoretic quantities for continuous variables in a non-parametric way. If your primary interest is in discrete entropy estimation, particularly with undersampled data, please see this package.

Example installation and usage:
```bash
git clone https://github.com/gregversteeg/NPEET.git
cd NPEET
pip install .
```

```python
>>> from npeet import entropy_estimators as ee
>>> x = [[1.3],[3.7],[5.1],[2.4],[3.4]]
>>> y = [[1.5],[3.32],[5.3],[2.3],[3.3]]
>>> ee.mi(x, y)
Out: 0.168
```
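Entropy estimation works the same way. A quick sanity-check sketch (assuming `ee.entropy` reports differential entropy in bits by default, which should be verified against the documentation): samples from a standard normal distribution have differential entropy 0.5 * log2(2 * pi * e), about 2.05 bits.

```python
import numpy as np
from npeet import entropy_estimators as ee

# Differential entropy of N(0, 1) is 0.5 * log2(2 * pi * e) ~ 2.05 bits.
rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 1)).tolist()  # convert to a list of lists

print(ee.entropy(samples))               # should land near 2.05
print(0.5 * np.log2(2 * np.pi * np.e))   # analytic reference value
```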
Another example, reading columns from a CSV file:
```python
import numpy as np
from npeet import entropy_estimators as ee

my_data = np.genfromtxt('my_file.csv', delimiter=',')  # the documentation describes options for skipping header rows and more
x = my_data[:,[5]].tolist()
y = my_data[:,[9]].tolist()
z = my_data[:,[15,17]].tolist()
print(ee.cmi(x, y, z))
print(ee.shuffle_test(ee.cmi, x, y, z, ci=0.95, ns=1000))
```

This prints the mutual information between columns 5 and 9, conditioned on columns 15 and 17. You can also use the function `shuffle_test` to return confidence intervals for any estimator. `shuffle_test` returns the mean CMI under the null hypothesis (CMI = 0) and a 95% confidence interval, estimated using 1000 random permutations of the data.
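As a concrete illustration of the significance test, here is a sketch that assumes `shuffle_test` returns a `(null_mean, (ci_low, ci_high))` pair; check the documentation for the exact return format:

```python
from npeet import entropy_estimators as ee

x = [[1.3], [3.7], [5.1], [2.4], [3.4]]
y = [[1.5], [3.32], [5.3], [2.3], [3.3]]

observed = ee.mi(x, y)
# Null distribution of MI under random permutations of y. The unpacking
# below assumes the (mean, (lower, upper)) return format.
null_mean, (ci_low, ci_high) = ee.shuffle_test(ee.mi, x, y, ci=0.95, ns=1000)
if observed > ci_high:
    print(f"MI = {observed:.3f} lies above the 95% null interval")
```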
*Note that we converted the NumPy arrays to lists! The current version works only on Python lists (lists of lists, actually, as in the first example).*

See the documentation for references on all implemented estimators.
```latex
@article{kraskov_estimating_2004,
  title = {Estimating mutual information},
  url = {https://link.aps.org/doi/10.1103/PhysRevE.69.066138},
  doi = {10.1103/PhysRevE.69.066138},
  journaltitle = {Physical Review E},
  author = {Kraskov, Alexander and Stögbauer, Harald and Grassberger, Peter},
  date = {2004-06-23},
}

@misc{steeg_information-theoretic_2013,
  title = {Information-Theoretic Measures of Influence Based on Content Dynamics},
  url = {http://arxiv.org/abs/1208.4475},
  doi = {10.48550/arXiv.1208.4475},
  author = {Steeg, Greg Ver and Galstyan, Aram},
  date = {2013-02-15},
}

@misc{steeg_information_2011,
  title = {Information Transfer in Social Media},
  url = {http://arxiv.org/abs/1110.2724},
  doi = {10.48550/arXiv.1110.2724},
  author = {Steeg, Greg Ver and Galstyan, Aram},
  date = {2011-10-12},
}
```

The non-parametric estimators actually fare poorly for variables with strong relationships. See the following paper and the improved code available at

```latex
@misc{gao_efficient_2015,
title = {Efficient Estimation of Mutual Information for Strongly Dependent Variables},
url = {http://arxiv.org/abs/1411.2003},
doi = {10.48550/arXiv.1411.2003},
author = {Gao, Shuyang and Steeg, Greg Ver and Galstyan, Aram},
date = {2015-03-05},
}
```
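To see this effect numerically, here is a quick sketch (assuming `ee.mi` reports bits by default): for jointly Gaussian variables with correlation rho, the true mutual information is -0.5 * log2(1 - rho^2), which k-NN estimators of this kind underestimate badly as rho approaches 1.

```python
import numpy as np
from npeet import entropy_estimators as ee

# For jointly Gaussian (x, y) with correlation rho, the true mutual
# information is -0.5 * log2(1 - rho**2) bits.
rng = np.random.default_rng(0)
rho = 0.9999
n = 1000
x = rng.normal(size=n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)

true_mi = -0.5 * np.log2(1 - rho**2)  # ~ 6.1 bits
est_mi = ee.mi([[v] for v in x], [[v] for v in y])
print(f"true MI: {true_mi:.2f} bits, estimate: {est_mi:.2f} bits")
```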