https://github.com/ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data

data-visualization data-wrangling high-dimensional-data python text-vectorization time-series topic-modeling visualization


![Hypertools logo](images/hypercube.png)

"_To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it._" - Geoff Hinton

![Hypertools example](images/hypertools.gif)

## Overview

HyperTools is designed to facilitate
[dimensionality reduction](https://en.wikipedia.org/wiki/Dimensionality_reduction)-based
visual explorations of high-dimensional data. The basic pipeline is
to feed in a high-dimensional dataset (or a series of high-dimensional
datasets) and, in a single function call, reduce the dimensionality of
the dataset(s) and create a plot. The package is built atop many
familiar friends, including [matplotlib](https://matplotlib.org/),
[scikit-learn](http://scikit-learn.org/) and
[seaborn](https://seaborn.pydata.org/). The package was
featured on
[Kaggle's No Free Hunch blog](http://blog.kaggle.com/2017/04/10/exploring-the-structure-of-high-dimensional-data-with-hypertools-in-kaggle-kernels/) in April 2017. For a general overview, see [this talk](https://www.youtube.com/watch?v=hb_ER9RGtOM), given as part of the [MIND Summer School](https://summer-mind.github.io) at Dartmouth.
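
To make the "reduce the dimensionality" step concrete, here is a minimal sketch of one common reduction technique, PCA, written directly with NumPy's SVD. It is an illustration of the idea only, not HyperTools' own code; the function name `pca_reduce` and the random test data are assumptions made for this example.

```python
import numpy as np


def pca_reduce(data, ndims=3):
    """Project the rows of `data` onto their top `ndims` principal components."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)  # PCA operates on mean-centered data
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:ndims].T


rng = np.random.default_rng(0)
high_dim = rng.normal(size=(200, 14))  # 200 observations, 14 features
low_dim = pca_reduce(high_dim, ndims=3)
print(low_dim.shape)  # (200, 3)
```

The three output columns are the coordinates one would hand to a 3D plotting routine.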

## Try it!

Click the badge to launch a Binder instance with example uses:

[![Binder](http://mybinder.org/badge.svg)](http://mybinder.org:/repo/contextlab/hypertools-paper-notebooks)

Or check the [repo](https://github.com/ContextLab/hypertools-paper-notebooks) of Jupyter notebooks from the HyperTools [paper](https://arxiv.org/abs/1701.08290).

## Installation

To install the latest stable version, run:

`pip install hypertools`

To install the latest unstable version directly from GitHub, run:

`pip install -U git+https://github.com/ContextLab/hypertools.git`

Alternatively, clone the repository to your local machine:

`git clone https://github.com/ContextLab/hypertools.git`

Then navigate to the cloned folder and run:

`pip install -e .`

(These instructions assume that you have [pip](https://pip.pypa.io/en/stable/installing/) installed on your system.)

NOTE: If you have been using the development version of 0.5.0, please clear your
data cache (the `hypertools_data` folder in your home directory, e.g. `/Users/yourusername/hypertools_data`).
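
Clearing the cache can also be scripted. The sketch below assumes the cache is a folder named `hypertools_data` directly under the home directory, which generalizes the `/Users/yourusername/hypertools_data` path from the note; the helper name `clear_cache` is hypothetical and not part of HyperTools.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir=None):
    """Delete the cache directory if it exists; return True if it was removed."""
    # Assumption: the default cache lives at ~/hypertools_data.
    if cache_dir is None:
        cache_dir = Path.home() / "hypertools_data"
    cache_dir = Path(cache_dir)
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```

Calling `clear_cache()` with no argument targets the assumed default location; pass an explicit path if your cache lives elsewhere.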

## Requirements

+ python>=3.6
+ PPCA>=0.0.2
+ scikit-learn>=0.24.0
+ pandas>=0.18.0
+ seaborn>=0.8.1
+ matplotlib>=1.5.1
+ scipy>=1.0.0
+ numpy>=1.10.4
+ umap-learn>=0.4.6
+ requests
+ pytest (for development)
+ ffmpeg (for saving animations)
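
If an install misbehaves, it can help to compare the installed versions against the minimums above. The following is a rough sketch using the standard library's `importlib.metadata` (Python 3.8 or newer). The helper names are hypothetical, the version parsing is deliberately simplistic, and it assumes the names in the list match the installed distribution names.

```python
from importlib import metadata

# Minimum versions copied from the list above
# (python, requests, pytest and ffmpeg are left out for brevity).
MINIMUMS = {
    "PPCA": "0.0.2",
    "scikit-learn": "0.24.0",
    "pandas": "0.18.0",
    "seaborn": "0.8.1",
    "matplotlib": "1.5.1",
    "scipy": "1.0.0",
    "numpy": "1.10.4",
    "umap-learn": "0.4.6",
}


def version_tuple(version):
    """Turn '1.10.4' into (1, 10, 4), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

An empty dictionary from `check_requirements()` means every listed package meets its minimum. For pre-release or otherwise unusual version strings, a dedicated parser such as `packaging.version.Version` is more reliable than this tuple comparison.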

## Documentation

Check out our [readthedocs](http://hypertools.readthedocs.io/en/latest/) page for further documentation, complete API details, and additional examples.

## Citing

We wrote a short JMLR paper about HyperTools, which you can read [here](http://jmlr.org/papers/v18/17-434.html), or you can check out a (longer) preprint [here](https://arxiv.org/abs/1701.08290). We also have a repository with example notebooks from the paper [here](https://github.com/ContextLab/hypertools-paper-notebooks).

Please cite as:

`Heusser AC, Ziman K, Owen LLW, Manning JR (2018) HyperTools: A Python toolbox for gaining geometric insights into high-dimensional data. Journal of Machine Learning Research, 18(152): 1--6.`

Here is a bibtex formatted reference:

```bibtex
@ARTICLE {,
author = {Andrew C. Heusser and Kirsten Ziman and Lucy L. W. Owen and Jeremy R. Manning},
title = {HyperTools: a Python Toolbox for Gaining Geometric Insights into High-Dimensional Data},
journal = {Journal of Machine Learning Research},
year = {2018},
volume = {18},
number = {152},
pages = {1--6},
url = {http://jmlr.org/papers/v18/17-434.html}
}
```

## Contributing

[![Join the chat at https://gitter.im/hypertools/Lobby](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/hypertools/Lobby)

If you'd like to contribute, please first read our [Code of Conduct](https://www.mozilla.org/en-US/about/governance/policies/participation/).

For specific information on how to contribute to the project, please see our [Contributing](https://github.com/ContextLab/hypertools/blob/master/CONTRIBUTING.md) page.

## Testing

[![Build Status](https://travis-ci.org/ContextLab/hypertools.svg?branch=master)](https://travis-ci.org/ContextLab/hypertools)

To test HyperTools, install pytest (`pip install pytest`) and run `pytest` in the HyperTools folder.

## Examples

See [here](http://hypertools.readthedocs.io/en/latest/auto_examples/index.html) for more examples.

## Plot

```python
import hypertools as hyp
hyp.plot(list_of_arrays, '.', group=list_of_labels)
```

![Plot example](images/plot.gif)
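
The snippet above leaves its inputs undefined. A self-contained variant with synthetic data might look like the sketch below. The helper names and the data are made up for illustration, and the sketch assumes that `group` accepts one label per row when a single array is passed; check the documentation for the exact input formats.

```python
import numpy as np


def make_labeled_data(n_per_group=50, n_features=10, seed=0):
    """Return (data, labels): rows drawn around three centers, one label per row."""
    rng = np.random.default_rng(seed)
    blocks, labels = [], []
    for i, name in enumerate(["a", "b", "c"]):
        # Shift each group's mean so the groups separate after reduction.
        blocks.append(rng.normal(loc=3.0 * i, size=(n_per_group, n_features)))
        labels.extend([name] * n_per_group)
    return np.vstack(blocks), labels


def plot_groups():
    """Plot the synthetic data as points, colored by group label."""
    import hypertools as hyp  # imported lazily so the data helper works without it

    data, labels = make_labeled_data()
    # '.' draws points instead of lines; group colors each row by its label.
    return hyp.plot(data, '.', group=labels)
```

Calling `plot_groups()` requires HyperTools to be installed; `make_labeled_data()` depends only on NumPy.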

## Align

```python
import hypertools as hyp
hyp.plot(list_of_arrays, align='hyper')
```

### BEFORE

![Align before example](images/align_before.gif)

### AFTER

![Align after example](images/align_after.gif)
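
The basic idea behind aligning datasets, rotating one dataset into the coordinate frame of another, can be illustrated with an orthogonal Procrustes fit. The sketch below is a simplified stand-in written in NumPy, not the HyperTools implementation of `align='hyper'`; the function name and the synthetic data are assumptions for this example.

```python
import numpy as np


def procrustes_rotation(source, target):
    """Orthogonal matrix R minimizing ||source @ R - target|| (Frobenius norm)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)
target = rng.normal(size=(100, 3))
# Build a misaligned copy by applying a random orthogonal transform.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
source = target @ q

rotation = procrustes_rotation(source, target)
aligned = source @ rotation
print(np.allclose(aligned, target))  # True
```

Here the misalignment is an exact orthogonal transform, so the fit recovers the target exactly; with noisy real data the aligned copy would only approximate it.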

## Cluster

```python
import hypertools as hyp
hyp.plot(array, '.', n_clusters=10)
```

![Cluster Example](images/cluster_example.png)

## Describe

```python
import hypertools as hyp
hyp.tools.describe(list_of_arrays, reduce='PCA', max_dims=14)
```

![Describe Example](images/describe_example.png)