https://github.com/macarro/imputena

# imputena: impute missing values using Python

[![Build Status](https://travis-ci.com/macarro/imputena.svg?branch=master)](https://travis-ci.com/macarro/imputena)
[![Documentation Status](https://readthedocs.org/projects/imputena/badge/?version=latest)](https://imputena.readthedocs.io/en/latest/?badge=latest)
[![Coverage Status](https://coveralls.io/repos/github/macarro/imputena/badge.svg?branch=master)](https://coveralls.io/github/macarro/imputena?branch=master)
[![PyPI](https://img.shields.io/pypi/v/imputena)](https://pypi.org/project/imputena)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/imputena)](https://pypi.org/project/imputena)

This Python package allows both automated and customized treatment of missing
values in datasets. The implemented treatments are:

* Listwise deletion
* Pairwise deletion
* Dropping variables
* Random sample imputation
* Random hot-deck imputation
* Last observation carried forward (LOCF)
* Next observation carried backward (NOCB)
* Most frequent substitution
* Mean and median substitution
* Constant value imputation
* Random value imputation
* Interpolation
* Interpolation with seasonal adjustment
* Linear regression imputation
* Stochastic regression imputation
* Logistic regression imputation
* K-nearest neighbors imputation
* Sequential regression multiple imputation
* Multiple imputation by chained equations

All of these treatments can be applied to whole datasets or to parts of them,
and each allows extensive customization. The package can also recommend a
treatment for a given dataset, report which treatments are applicable to it,
and automatically apply the recommended treatment.
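
To make a few of the simpler treatments in the list above concrete, here is a minimal pure-Python sketch of listwise deletion, LOCF, NOCB and mean substitution. It does not use imputena's API: the function names and the use of `None` as the missing-value marker are assumptions made purely for illustration. Refer to the documentation for the functions the package actually provides.

```python
from statistics import mean


def drop_incomplete_rows(rows):
    """Listwise deletion: keep only rows without any missing value."""
    return [row for row in rows if None not in row]


def carry_forward(values):
    """LOCF: fill each gap with the most recent observed value.

    Leading gaps have no earlier observation and stay missing.
    """
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def carry_backward(values):
    """NOCB: fill each gap with the next observed value.

    Implemented as LOCF on the reversed series.
    """
    return carry_forward(values[::-1])[::-1]


def fill_with_mean(values):
    """Mean substitution: replace gaps with the mean of the observed values."""
    observed = [value for value in values if value is not None]
    if not observed:
        return list(values)  # nothing observed, nothing to impute from
    average = mean(observed)
    return [average if value is None else value for value in values]


series = [None, 2.0, None, 4.0, None]
print(carry_forward(series))   # [None, 2.0, 2.0, 4.0, 4.0]
print(carry_backward(series))  # [2.0, 2.0, 4.0, 4.0, None]
print(fill_with_mean(series))  # [3.0, 2.0, 3.0, 4.0, 3.0]
```

Note that LOCF cannot fill a gap at the start of a series and NOCB cannot fill one at the end, which is one reason a combination of treatments may be needed in practice.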

## Installation

### Most recent release

To install or update to the most recently published release, run:

```ShellSession
pip install imputena
```

This will fetch the release from PyPI and install it along with its dependencies.

### Current development version

Clone this repository, or download and unzip it. Then, from the project root
directory, run:

```ShellSession
pip install .
```
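
After either installation method, one way to confirm which version ended up in the current environment is to query the installed distribution metadata. The sketch below uses only the standard library and assumes Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of imputena.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None if imputena is not installed here.
print(installed_version("imputena"))
```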

## Documentation

### View online

The documentation for the latest version is available at
[imputena.readthedocs.io](https://imputena.readthedocs.io/en/latest).

### Generate locally

The documentation is generated by Sphinx from the docstrings. To build it, run
either of the following commands in the `docs` directory:

```ShellSession
make html
make latexpdf
```

The generated documentation will be located in `docs/build`.

## Tests

The tests for the implemented functions are located in the `test` directory and
use the `unittest` module from the Python standard library.

To execute all tests, run the following command at the project root directory:

```ShellSession
python -m unittest
```

To execute only the tests contained in a particular test module, for example
`deletion/test_delete_listwise.py`, run the following command at the
project root directory:

```ShellSession
python -m unittest test.deletion.test_delete_listwise
```
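
For readers unfamiliar with `unittest`, the following sketch shows the general shape of a test module that the commands above would pick up. It does not reproduce the repository's real tests: `drop_incomplete_rows` is a hypothetical stand-in defined inline, where a real test would import the function under test from the package instead.

```python
import unittest


def drop_incomplete_rows(rows):
    """Hypothetical stand-in for the function under test (listwise deletion)."""
    return [row for row in rows if None not in row]


class TestDropIncompleteRows(unittest.TestCase):
    """unittest collects every method whose name starts with `test`."""

    def test_rows_with_missing_values_are_removed(self):
        rows = [[1, 2], [None, 3], [4, None], [5, 6]]
        self.assertEqual(drop_incomplete_rows(rows), [[1, 2], [5, 6]])

    def test_complete_data_is_unchanged(self):
        rows = [[1, 2], [3, 4]]
        self.assertEqual(drop_incomplete_rows(rows), rows)
```

Saved under a name matching the default discovery pattern `test*.py` inside an importable package, a module like this is found by `python -m unittest` without further configuration.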