# 🧿 insight

![GitHub top language](https://img.shields.io/github/languages/top/synthesized-io/insight?color=%2387B&logo=python&logoColor=cfa5ea)
![GitHub](https://img.shields.io/github/license/synthesized-io/insight?color=75b1a4&label=license&logo=Open%20Source%20Initiative&logoColor=%233DD639)
![PyPI - Downloads](https://img.shields.io/pypi/dm/insight?color=%23A25)
![GitHub Repo stars](https://img.shields.io/github/stars/synthesized-io/insight?logo=github&logoColor=%23333&style=social)

A Python package to quickly **understand**, **assess**, and **compare** pandas `Series` and `DataFrame` objects.

The package focuses on easy-to-use **metrics** and intelligent **plotting functions**. Metrics can also be configured from YAML, which keeps benchmarking and assessment scripts simple to set up.
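To illustrate what configuration-driven metrics can look like, here is a minimal sketch of the general pattern. It does not use `insight`'s own YAML schema or API: the registry `METRICS`, the metric names, the `run_benchmark` helper, and the config layout are all hypothetical, and the config is written as the plain dict that a YAML loader would return.

```python
import statistics

# Hypothetical registry: these names are illustrative, not insight's metric names.
METRICS = {
    "mean_difference": lambda a, b: statistics.mean(a) - statistics.mean(b),
    "median_difference": lambda a, b: statistics.median(a) - statistics.median(b),
}


def run_benchmark(config, a, b):
    """Evaluate every metric named in `config` on the pair of samples (a, b)."""
    results = {}
    for entry in config["metrics"]:
        name = entry["name"]
        if name not in METRICS:
            raise KeyError(f"unknown metric: {name}")
        results[name] = METRICS[name](a, b)
    return results


# Equivalent to loading a (hypothetical) YAML file such as:
#   metrics:
#     - name: mean_difference
#     - name: median_difference
config = {"metrics": [{"name": "mean_difference"}, {"name": "median_difference"}]}

print(run_benchmark(config, [1, 2, 3, 10], [1, 2, 3, 4]))
```

Keeping the metric selection in data rather than code means a benchmarking script can stay unchanged while the list of metrics evolves.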

## [![PyPI](https://img.shields.io/pypi/v/insight?color=%23DA5&label=latest&logo=semver)](https://pypi.org/project/insight/) [![CodeQL Status](https://img.shields.io/github/workflow/status/synthesized-io/insight/CodeQL?color=%236ac&label=CodeQL&logo=GitHub%20Actions&logoColor=%236ac)](https://github.com/synthesized-io/insight/actions/workflows/code-ql.yml) [![CI Status](https://img.shields.io/github/workflow/status/synthesized-io/insight/CI?label=CI&logo=GitHub%20Actions&logoColor=%23afb)](https://github.com/synthesized-io/insight/actions/workflows/CI.yaml) [![Coverage](https://sonarcloud.io/api/project_badges/measure?project=synthesized-io_synthesized-insight&metric=coverage)](https://sonarcloud.io/summary/new_code?id=synthesized-io_synthesized-insight) [![Code Smells](https://sonarcloud.io/api/project_badges/measure?project=synthesized-io_synthesized-insight&metric=code_smells)](https://sonarcloud.io/summary/new_code?id=synthesized-io_synthesized-insight) [![pre-commit.ci status](https://results.pre-commit.ci/badge/github/synthesized-io/insight/master.svg)](https://results.pre-commit.ci/latest/github/synthesized-io/insight/master)

## Installation

```shell
pip install insight
```

## Usage

### Metrics

At the core of `insight` are the metric classes, which can be evaluated on one series, two series,
one dataframe, or two dataframes.

```pycon
>>> import insight.metrics as m
>>> metric = m.EarthMoversDistance()
>>> metric(df['A'], df['B'])
0.14
```
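For intuition about what a two-series metric such as an earth mover's distance measures, the sketch below computes the one-dimensional earth mover's (Wasserstein-1) distance between two non-empty numeric samples as the area between their empirical CDFs. This is a standalone illustration using only the standard library; the function name is an assumption, and this is not necessarily how `insight`'s `EarthMoversDistance` is implemented or what kinds of data it accepts.

```python
from bisect import bisect_right


def earth_movers_distance(a, b):
    """1-D earth mover's (Wasserstein-1) distance between two numeric samples.

    Computed as the integral of |CDF_a(x) - CDF_b(x)| over x, where both
    empirical CDFs are step functions that change only at observed values.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))
    distance = 0.0
    for left, right in zip(points, points[1:]):
        # Fraction of each sample that is <= left; constant on [left, right).
        cdf_a = bisect_right(a_sorted, left) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, left) / len(b_sorted)
        distance += abs(cdf_a - cdf_b) * (right - left)
    return distance


# Shifting every value by 5 moves all the "mass" a distance of 5.
print(earth_movers_distance([0, 1, 3], [5, 6, 8]))
```

Because the function only iterates over its inputs, it accepts lists as well as pandas `Series` of numbers.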

### Plotting

The package provides plotting functions for exploring a single series, a single dataframe,
or multiple dataframes.

```pycon
>>> import insight.plotting as p
>>> p.plot_dataset([df1, df2])
```

### Migrations

`insight` writes results to the Postgres database configured through environment variables. To run migrations against that database, run:

```bash
insight-migrations
```
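Since the database connection depends on environment variables, a small pre-flight check can catch a missing setting before the migration command is run. The sketch below is only an illustration: the variable names in `REQUIRED_VARS` and the helper `missing_settings` are hypothetical, so substitute the names the project actually reads.

```python
import os

# Hypothetical variable names: check the project's documentation for the real ones.
REQUIRED_VARS = (
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DATABASE",
)


def missing_settings(env, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Set these variables before running insight-migrations:", ", ".join(missing))
    else:
        print("All database settings are present.")
```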

