data⎰describe: Pythonic EDA Accelerator for Data Science
https://github.com/data-describe/data-describe

analysis data-science eda exploratory-data-analysis pypi


[![PyPI status](https://img.shields.io/pypi/status/data-describe.svg)](https://pypi.python.org/pypi/data-describe/)
[![PyPI license](https://img.shields.io/pypi/l/data-describe.svg)](https://pypi.python.org/pypi/data-describe/)
[![Downloads](https://pepy.tech/badge/data-describe/month)](https://pepy.tech/project/data-describe/month)

[![PyPI version shields.io](https://img.shields.io/pypi/v/data-describe.svg)](https://pypi.python.org/pypi/data-describe/)
[![PyPI pyversions](https://img.shields.io/pypi/pyversions/data-describe.svg)](https://pypi.python.org/pypi/data-describe/)
![codecov](https://codecov.io/gh/data-describe/data-describe/branch/master/graph/badge.svg?token=CY0M5NAMXH)

# data ⎰ describe

[data-describe](https://data-describe.ai/) is a Python toolkit for Exploratory Data Analysis (EDA). It aims to accelerate data exploration and analysis by providing automated and polished analysis widgets.

For examples of data-describe in action, see the [Quick Start Tutorial](https://data-describe.ai/docs/master/_notebooks/quick_start.html).

## Main Features

data-describe implements the following basic features:

| Feature | Description |
| ----------- | ----------- |
| Data Summary | Curated data summary |
| Data Heatmap | Data variation and missingness heatmap |
| Correlation Matrix | Correlation heatmaps with categorical support |
| Distribution Plots | Histograms, violin plots, and bar charts |
| Scatterplots | Scatterplots evaluated with scatterplot diagnostics |
| Cluster Analysis | Automated clustering and plotting |
| Feature Ranking | Evaluate feature importance using tree models |

## Extended Features

data-describe is always looking to elevate the standard for Exploratory Data Analysis. Here are a few of the extended features already implemented:

* Dimensionality Reduction Methods
* Sensitive Data (PII) Redaction
* Text Pre-processing / Topic Modeling
* Big Data Support

## Installation

data-describe can be installed using pip:

```shell
pip install data-describe
```
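
To confirm that the install succeeded, one option is to query the installed distribution's version from Python. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical and not part of data-describe.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if data-describe is installed, otherwise None.
print(installed_version("data-describe"))
```

Note that the distribution name uses a hyphen (`data-describe`), while the import name uses an underscore (`data_describe`).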

## Getting Started

```python
import data_describe as dd
help(dd)
```

See the [User Guide](https://data-describe.ai/docs/master/_notebooks/user_guide.html) for more information.
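
For a quicker overview than the full `help(dd)` output, the package's public names can be listed directly. This is a minimal sketch using only the standard library; `public_names` is a hypothetical helper, and it makes no assumption about which functions data-describe actually exposes.

```python
import importlib


def public_names(module_name):
    """Return the sorted public attribute names of a module, or [] if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return []
    # Names starting with an underscore are private by convention.
    return sorted(name for name in dir(module) if not name.startswith("_"))


# Prints the public names of data_describe, or an empty list if it is not installed.
print(public_names("data_describe"))
```

Any name in the list can then be inspected individually with `help()`.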

## Project Status

data-describe is currently in **beta** status.

## Contributing

data-describe welcomes [contributions from the community](./CONTRIBUTING.md).