https://github.com/martinSter/modi
modi
- Host: GitHub
- URL: https://github.com/martinSter/modi
- Owner: martinSter
- License: other
- Created: 2018-07-30T08:27:45.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-02-07T15:42:20.000Z (9 months ago)
- Last Synced: 2024-05-21T02:12:33.693Z (6 months ago)
- Language: R
- Homepage:
- Size: 548 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
README
---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```

# modi
The package **modi** provides several functions for multivariate outlier detection and imputation. They are intended for the analysis of multivariate quantitative survey data. Such data are often not multivariate normal; they are frequently skewed and exhibit features of a semi-continuous distribution. Missing values and non-response are also common. The functions provided in **modi** address these problems.
## Overview
The following outlier detection and imputation functions are provided in **modi**:
* `BEM()` is an implementation of the BACON-EEM algorithm to detect outliers under the assumption of a multivariate normal distribution.
* `TRC()` is an implementation of the transformed rank correlation (TRC) algorithm to detect outliers.
* `EAdet()` is an outlier detection method based on the epidemic algorithm (EA).
* `EAimp()` is an imputation method based on the epidemic algorithm (EA).
* `GIMCD()` is an outlier detection method based on non-robust Gaussian imputation (GI) and the highly robust minimum covariance determinant (MCD) algorithm.
* `POEM()` is a nearest neighbor imputation method for outliers and missing values.
* `winsimp()` is an imputation method for outliers and missing values based on winsorization and Gaussian imputation.
* `ER()` is a robust multivariate outlier detection algorithm that can cope with missing values.

## Installation
You can install modi from GitHub by running the following line of code in R:
```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("martinSter/modi")
```

## Example
The following simple example shows how the BACON-EEM algorithm can be applied to detect outliers in the Bushfire dataset:
```{r example}
# Bushfire data set with 20% of observations missing completely at random (MCAR)
library(modi)

# load the data containing missing values and the corresponding weights
data(bushfirem, bushfire.weights)

# run BEM algorithm to detect multivariate outliers
bem.res <- BEM(bushfirem, bushfire.weights, alpha = (1 - 0.01 / nrow(bushfirem)))

# show the outliers as detected by BEM
bushfirem[bem.res$outind, ]

# show mean per column as computed in BEM
print(bem.res$center)
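
# --- Hedged follow-up sketch (not part of the original example) ---
# It uses only the two result elements referenced above, `outind` and `center`.
# The names `flagged`, `n.flagged` and `naive.center` are illustrative.

# count the flagged observations; subsetting rows works whether `outind` is a
# logical vector or a vector of row indices, so no assumption is made about its type
flagged <- bushfirem[bem.res$outind, , drop = FALSE]
n.flagged <- nrow(flagged)
cat("Observations flagged as outliers:", n.flagged, "of", nrow(bushfirem), "\n")

# compare the BEM center with the plain column means of the observed values
# (na.rm = TRUE skips missing values; the sampling weights are ignored here,
# so this is only a rough, non-robust reference point)
naive.center <- colMeans(bushfirem, na.rm = TRUE)
print(cbind(BEM = as.numeric(bem.res$center), naive = naive.center))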
```

## Acknowledgement
The implementation of this R package was supported by the Hasler Foundation.