https://github.com/ajmazurie/xstats.MINE
Python wrapper for the Java-based Maximal Information-based Nonparametric Exploration (MINE) statistics library
- Host: GitHub
- URL: https://github.com/ajmazurie/xstats.MINE
- Owner: ajmazurie
- License: other
- Created: 2011-12-20T20:57:16.000Z (almost 13 years ago)
- Default Branch: master
- Last Pushed: 2012-02-03T18:41:04.000Z (over 12 years ago)
- Last Synced: 2024-07-19T22:50:03.074Z (2 months ago)
- Language: Python
- Homepage:
- Size: 105 KB
- Stars: 19
- Watchers: 4
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
xstats.MINE
===========

``xstats.MINE`` is a Python library wrapping the Maximal Information-based Nonparametric Exploration (MINE) statistical library which is, for now, only available as a Java implementation.
``xstats.MINE`` can be used either with the Jython interpreter or with Python using JPype.
MINE is a set of statistics that can be used to identify important relationships in datasets and characterize these relationships. Given a relationship between two vectors of scalars, MINE produces the following scores:
- MIC (maximum information coefficient), which captures relationship strength
- MAS (maximum asymmetry score), which captures departure from monotonicity
- MEV (maximum edge value), which captures closeness to being a function
- MCN (minimum cell number), which captures complexity

A complete description of MINE and examples of use can be found in the following article: D. Reshef, Y. Reshef, H. Finucane, S. Grossman, G. McVean, P. Turnbaugh, E. Lander, M. Mitzenmacher, P. Sabeti. Detecting novel associations in large datasets. Science 334, 6062 (2011).
**Contact** Aurelien Mazurie
**Keywords** MINE, Statistics, Python, Jython, JPype
Installation
------------

Please follow these steps to ensure a proper installation of ``xstats.MINE``; note that step 2 can be skipped if you only intend to use ``xstats.MINE`` with Jython.
1. Installation of MINE.jar
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The file **MINE.jar**, which you can retrieve at http://www.exploredata.net/Downloads/MINE-Application, must be downloaded to your computer. It is advised to place this file in a stable location; e.g., a directory on your computer dedicated to Java .jar files.

Once downloaded, **MINE.jar** must be made visible to the Java interpreter that lies behind Jython and JPype. This typically means adding the path to this file (wherever you placed it) to the ``CLASSPATH`` environment variable. If you are not familiar with environment variables, reading a quick introduction to the concept first is recommended.

The technique for modifying ``CLASSPATH`` differs slightly between Windows and the various flavors of Unix. Tutorials on setting the ``PATH`` variable apply equally here; simply replace references to ``PATH`` with references to ``CLASSPATH``.
Please note that this version of ``xstats.MINE`` is compatible with ``MINE.jar`` version 1.0.1b through 1.0.1d.
2. Installation of JPype
~~~~~~~~~~~~~~~~~~~~~~~~

If you plan to use ``xstats.MINE`` with Python you need to have JPype installed first. An easy way to do so, if you have **setuptools** installed, is to type ::

    easy_install JPype

(see the relevant JPype documentation)
You will also need to download the **commons-io-X.X.jar** file from http://commons.apache.org/io/; X.X is the version of the Commons IO library (2.1 at the time of writing). This file must be declared in your ``CLASSPATH`` the same way you did for **MINE.jar**; see instructions in Step 1.
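As a quick sanity check for Steps 1 and 2, the sketch below lists which of the required jars do not appear in ``CLASSPATH``. It is not part of ``xstats.MINE``: the helper name ``missing_jars`` is hypothetical, and the matching rule (file name starts with the given prefix and ends with ``.jar``) is an assumption that does not handle wildcard entries such as ``lib/*``.

```python
import os


def missing_jars(required, classpath=None):
    """Return the names in `required` that have no matching CLASSPATH entry.

    Hypothetical helper, not part of xstats.MINE. An entry matches a name
    when its file name starts with that name and ends with ".jar", so
    "commons-io" matches "commons-io-2.1.jar".
    """
    if classpath is None:
        classpath = os.environ.get("CLASSPATH", "")
    # Entries are separated by ":" on Unix and ";" on Windows (os.pathsep)
    entries = [entry for entry in classpath.split(os.pathsep) if entry]
    filenames = [os.path.basename(entry) for entry in entries]
    missing = []
    for name in required:
        found = any(f.startswith(name) and f.endswith(".jar") for f in filenames)
        if not found:
            missing.append(name)
    return missing


# Reports the jars that still need to be declared; the result depends on
# the environment this is run in.
print(missing_jars(["MINE", "commons-io"]))
```

An empty list suggests both jars are declared; any name that is printed still needs to be added to ``CLASSPATH`` as described in Step 1.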
3. Installation of xstats.MINE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Finally, to install ``xstats.MINE`` for both Python and Jython please follow these steps:

- Download the latest version of the library from http://github.com/ajmazurie/xstats.MINE/downloads
- Unzip the downloaded file, and ``cd`` into the resulting directory
- Run ``python setup.py install``

To update ``xstats.MINE`` with newer versions just repeat Step 3.
Examples of use
---------------

Example #1: MINE on a pair of scalars
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The method **analyze_pair()** can be used to calculate the various MINE scores on a pair of scalar vectors. For example, ::

    import xstats.MINE

    x = [40,50,None,70,80,90,100,110,120,130,140,150,
         160,170,180,190,200,210,220,230,240,250,260]

    y = [-0.07,-0.23,-0.1,0.03,-0.04,None,-0.28,-0.44,-0.09,0.12,0.06,
         -0.04,0.31,0.59,0.34,-0.28,-0.09,-0.44,0.31,0.03,0.57,0,0.01]

    print "x y", xstats.MINE.analyze_pair(x, y)
will return the following scores::

    {'MCN': 2.5849625999999999,
     'MAS': 0.040419996,
     'pearson': 0.31553724,
     'MIC': 0.38196000000000002,
     'MEV': 0.27117000000000002,
     'non_linearity': 0.28239626000000001}

Example #2: MINE on a file
~~~~~~~~~~~~~~~~~~~~~~~~~~

The method **analyze_file()** can be used to calculate the various MINE scores on values read from a comma- or tab-delimited file. The function can consider all pairs of variables in the file, only adjacent variables, or compare all variables in turn against a master variable.
If the input file has a **.csv** extension the function will assume it is a comma-delimited file; otherwise it assumes it is a tab-delimited file.
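To make the three comparison modes concrete, here is a minimal sketch of which variable pairs each one would cover, with variables identified by their column index. The function ``variable_pairs`` and the mode strings are hypothetical names chosen for illustration; they are not the constants or the implementation used by ``xstats.MINE``, and "adjacent" is assumed to mean consecutive columns.

```python
from itertools import combinations


def variable_pairs(n_variables, mode, master=0):
    """List the (i, j) column-index pairs that a comparison mode would cover.

    Hypothetical illustration; the mode names are not xstats.MINE constants.
    """
    if mode == "all_pairs":
        # Every unordered pair of distinct variables
        return list(combinations(range(n_variables), 2))
    if mode == "adjacent_pairs":
        # Each variable against the one in the next column
        return [(i, i + 1) for i in range(n_variables - 1)]
    if mode == "master_variable":
        # One chosen variable against every other variable
        return [(master, j) for j in range(n_variables) if j != master]
    raise ValueError("unknown mode: %r" % (mode,))


for mode in ("all_pairs", "adjacent_pairs", "master_variable"):
    print(mode + ": " + str(variable_pairs(4, mode)))
```

The number of pairs grows quadratically with the number of variables in the first mode and only linearly in the other two, which is worth keeping in mind for wide files.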
For example, analyzing the **Spellman.csv** file which can be found at http://www.exploredata.net/Downloads/Gene-Expression-Data-Set ::

    import xstats.MINE

    for a, b, scores in xstats.MINE.analyze_file("Spellman.csv", xstats.MINE.MASTER_VARIABLE, 0, cv = 0.7):
        print a, b, scores

will display the following (only the first lines are shown; lines are truncated)::

    time YER044C {'MCN': 2.5849625999999999, 'MAS': 0.16225999999999999, ...}
    time YNL178W {'MCN': 2.5849625999999999, 'MAS': 0.46802998000000001, ...}
    time YCR098C {'MCN': 2.0, 'MAS': 0.0, ...}
    time YEL050C {'MCN': 2.0, 'MAS': 0.0, ...}

Note that this example replicates the one shown in the MINE documentation (see http://www.exploredata.net/Usage-instructions/Parameters)::

    java -jar MINE.jar Spellman.csv 0 cv=0.7
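The score dictionaries are plain Python dictionaries and can be post-processed directly. As an illustration, the ``non_linearity`` value reported in Example #1 appears to be consistent with the MIC minus the squared Pearson coefficient, the nonlinearity measure described by Reshef et al. The sketch below recomputes it from the values printed in Example #1; the helper ``nonlinearity`` is a hypothetical name, not a function of ``xstats.MINE``.

```python
# Scores copied from the output of Example #1 (trailing float noise dropped)
scores = {
    'MCN': 2.5849626,
    'MAS': 0.040419996,
    'pearson': 0.31553724,
    'MIC': 0.38196,
    'MEV': 0.27117,
    'non_linearity': 0.28239626,
}


def nonlinearity(mic, pearson):
    """MIC minus the squared Pearson coefficient (hypothetical helper)."""
    return mic - pearson ** 2


recomputed = nonlinearity(scores['MIC'], scores['pearson'])

# The two values agree up to the rounding of the printed scores
print(abs(recomputed - scores['non_linearity']) < 1e-6)
```

A value close to zero indicates that a linear fit already accounts for most of the association detected by MIC.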
Licensing
---------

``xstats.MINE`` is released under an MIT/X11 license.

``MINE.jar`` is released under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license by its authors.