Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/easystats/correlation

:link: Methods for Correlation Analysis
https://github.com/easystats/correlation

bayesian bayesian-correlations biserial cor correlation correlation-analysis correlations easystats gamma gaussian-graphical-models hacktoberfest matrix multilevel-correlations outliers partial partial-correlations r regression robust spearman

Last synced: about 1 month ago
JSON representation

:link: Methods for Correlation Analysis

Awesome Lists containing this project

README

        

---
output: github_document
---

# correlation

```{r README-1, warning=FALSE, message=FALSE, echo=FALSE}
library(ggplot2)
library(poorman)
library(correlation)

options(digits = 2)

knitr::opts_chunk$set(
collapse = TRUE,
dpi = 300,
message = FALSE,
warning = FALSE,
fig.path = "man/figures/"
)
```

[![DOI](https://joss.theoj.org/papers/10.21105/joss.02306/status.svg)](https://doi.org/10.21105/joss.02306) [![total](https://cranlogs.r-pkg.org/badges/grand-total/correlation)](https://cranlogs.r-pkg.org/) [![status](https://tinyverse.netlify.com/badge/correlation)](https://CRAN.R-project.org/package=correlation)

`correlation` is an [**easystats**](https://github.com/easystats/easystats) package focused on correlation analysis. It's lightweight, easy to use, and allows for the computation of many different kinds of correlations, such as **partial** correlations, **Bayesian** correlations, **multilevel** correlations, **polychoric** correlations, **biweight**, **percentage bend** or **Sheperd's Pi** correlations (types of robust correlation), **distance** correlation (a type of non-linear correlation) and more, also allowing for combinations between them (for instance, *Bayesian partial multilevel correlation*).

# Citation

You can cite the package as follows:

Makowski, D., Ben-Shachar, M. S., Patil, I., \& Lüdecke, D. (2020). Methods and algorithms for correlation analysis in R. _Journal of Open Source Software_,
*5*(51), 2306. https://doi.org/10.21105/joss.02306

Makowski, D., Wiernik, B. M., Patil, I., Lüdecke, D., \& Ben-Shachar, M. S. (2022). *correlation*: Methods for correlation analysis [R package]. https://CRAN.R-project.org/package=correlation (Original work published 2020)

# Installation

[![CRAN](http://www.r-pkg.org/badges/version/correlation)](https://cran.r-project.org/package=correlation) [![correlation status badge](https://easystats.r-universe.dev/badges/correlation)](https://easystats.r-universe.dev) [![R-CMD-check](https://github.com/easystats/correlation/workflows/R-CMD-check/badge.svg?branch=main)](https://github.com/easystats/correlation/actions)
[![codecov](https://codecov.io/gh/easystats/correlation/branch/master/graph/badge.svg)](https://app.codecov.io/gh/easystats/correlation)

The *correlation* package is available on CRAN, while its latest development version is available on R-universe (from _rOpenSci_).

Type | Source | Command
---|---|---
Release | CRAN | `install.packages("correlation")`
Development | R-universe | `install.packages("correlation", repos = "https://easystats.r-universe.dev")`

Once you have downloaded the package, you can then load it using:

```{r, eval=FALSE}
library("correlation")
```

> **Tip**
>
> **Instead of `library(correlation)`, use `library(easystats)`.**
> **This will make all features of the easystats-ecosystem available.**
>
> **To stay updated, use `easystats::install_latest()`.**

# Documentation

Check out package [website](https://easystats.github.io/correlation/) for documentation.

# Features

The *correlation* package can compute many different types of correlation,
including:

✅ **Pearson's correlation**

✅ **Spearman's rank correlation**

✅ **Kendall's rank correlation**

✅ **Biweight midcorrelation**

✅ **Distance correlation**

✅ **Percentage bend correlation**

✅ **Shepherd's Pi correlation**

✅ **Blomqvist’s coefficient**

✅ **Hoeffding’s D**

✅ **Gamma correlation**

✅ **Gaussian rank correlation**

✅ **Point-Biserial and biserial correlation**

✅ **Winsorized correlation**

✅ **Polychoric correlation**

✅ **Tetrachoric correlation**

✅ **Multilevel correlation**

An overview and description of these correlations types is [**available here**](https://easystats.github.io/correlation/articles/types.html). Moreover,
many of these correlation types are available as **partial** or within a
**Bayesian** framework.

# Examples

The main function is [`correlation()`](https://easystats.github.io/correlation/reference/correlation.html), which builds on top of [`cor_test()`](https://easystats.github.io/correlation/reference/cor_test.html) and comes with a number of possible options.

## Correlation details and matrix

```{r README-4}
results <- correlation(iris)
results
```

The output is not a square matrix, but a **(tidy) dataframe with all correlations tests per row**. One can also obtain a **matrix** using:

```{r README-5}
summary(results)
```

Note that one can also obtain the full, **square** and redundant matrix using:

```{r README-6}
summary(results, redundant = TRUE)
```

```{r README-7}
library(see)

results %>%
summary(redundant = TRUE) %>%
plot()
```

## Correlation tests

The `cor_test()` function, for pairwise correlations, is also very convenient for making quick scatter plots.

```{r README-corr}
plot(cor_test(iris, "Sepal.Width", "Sepal.Length"))
```

## Grouped dataframes

The `correlation()` function also supports **stratified correlations**, all within the
*tidyverse* workflow!

```{r README-8}
iris %>%
select(Species, Sepal.Length, Sepal.Width, Petal.Width) %>%
group_by(Species) %>%
correlation()
```

## Bayesian Correlations

It is very easy to switch to a **Bayesian framework**.

```{r README-9}
correlation(iris, bayesian = TRUE)
```

## Tetrachoric, Polychoric, Biserial, Biweight...

The `correlation` package also supports different types of methods, which can
deal with correlations **between factors**!

```{r README-10}
correlation(iris, include_factors = TRUE, method = "auto")
```

## Partial Correlations

It also supports **partial correlations** (as well as Bayesian partial correlations).

```{r README-11}
iris %>%
correlation(partial = TRUE) %>%
summary()
```

## Gaussian Graphical Models (GGMs)

Such partial correlations can also be represented as **Gaussian Graphical
Models** (GGM), an increasingly popular tool in psychology. A GGM traditionally
include a set of variables depicted as circles ("nodes"), and a set of lines
that visualize relationships between them, which thickness represents the
strength of association (see [Bhushan et al., 2019](https://www.frontiersin.org/articles/10.3389/fpsyg.2019.01050/full)).

```{r README-12}
library(see) # for plotting
library(ggraph) # needs to be loaded

plot(correlation(mtcars, partial = TRUE)) +
scale_edge_color_continuous(low = "#000004FF", high = "#FCFDBFFF")
```

## Multilevel Correlations

It also provide some cutting-edge methods, such as Multilevel (partial)
correlations. These are are partial correlations based on linear mixed-effects
models that include the factors as **random effects**. They can be see as
correlations *adjusted* for some group (*hierarchical*) variability.

```{r README-13}
iris %>%
correlation(partial = TRUE, multilevel = TRUE) %>%
summary()
```

However, if the `partial` argument is set to `FALSE`, it will try to convert the
partial coefficient into regular ones.These can be **converted back** to full
correlations:

```{r README-14}
iris %>%
correlation(partial = FALSE, multilevel = TRUE) %>%
summary()
```

# Contributing and Support

In case you want to file an issue or contribute in another way to the package, please follow [this guide](https://easystats.github.io/correlation/CONTRIBUTING.html). For questions about the functionality, you may either contact us via email or also file an issue.

# Code of Conduct

Please note that this project is released with a
[Contributor Code of Conduct](https://easystats.github.io/correlation/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms.