https://github.com/friendly/heplots

Visualizing Hypothesis Tests in Multivariate Linear Models, http://friendly.github.io/heplots/
https://github.com/friendly/heplots
linear-hypotheses matrices multivariate-linear-models plot repeated-measure-designs visualizing-hypothesis-tests
Last synced: 4 months ago
JSON representation
Visualizing Hypothesis Tests in Multivariate Linear Models, http://friendly.github.io/heplots/
Host: GitHub
URL: https://github.com/friendly/heplots
Owner: friendly
Created: 2013-10-27T19:10:06.000Z (over 12 years ago)
Default Branch: master
Last Pushed: 2025-03-23T19:18:19.000Z (about 1 year ago)
Last Synced: 2025-04-01T21:55:52.856Z (about 1 year ago)
Topics: linear-hypotheses, matrices, multivariate-linear-models, plot, repeated-measure-designs, visualizing-hypothesis-tests
Language: R
Homepage:
Size: 107 MB
Stars: 9
Watchers: 4
Forks: 5
Open Issues: 3
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Citation: CITATION.cff
Awesome Lists containing this project

README

          ---

output: github_document

editor_options: 

  markdown: 

    wrap: 72

---

```{r, echo = FALSE}

knitr::opts_chunk$set(

  warning = FALSE,   # avoid warnings and messages in the output

  message = FALSE,

  collapse = TRUE,

  fig.width = 4,

  fig.height = 4,

  dpi = 96,

  comment = "#>",

  fig.path = "man/figures/README-"

)

par(mar=c(3,3,1,1)+.1)

```

```{r, echo=FALSE}

library(heplots)

```

[![Lifecycle:

stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)

[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/heplots)](http://cran.r-project.org/package=heplots)

[![R_Universe](https://friendly.r-universe.dev/badges/heplots)](https://friendly.r-universe.dev)

[![Last Commit](https://img.shields.io/github/last-commit/friendly/heplots)](https://github.com/friendly/heplots/)

[![Downloads](http://cranlogs.r-pkg.org/badges/grand-total/heplots)](https://cran.r-project.org/package=heplots)

[![DOI](https://zenodo.org/badge/13908453.svg)](https://zenodo.org/badge/latestdoi/13908453)

[![Docs](https://img.shields.io/badge/pkgdown%20site-blue)](https://friendly.github.io/heplots/)

# heplots 

## **Visualizing Hypothesis Tests in Multivariate Linear Models** 

Version `r getNamespaceVersion("heplots")`; documentation built for `pkgdown` `r Sys.Date()`

## Description 

The `heplots` package provides functions for visualizing hypothesis

tests in multivariate linear models ("MLM" = {MANOVA, multivariate multiple

regression, MANCOVA, and repeated measures designs}). It also provides other tools for

analysis and graphical display of MLMs.

HE plots represent sums-of-squares-and-products matrices for linear

hypotheses (**H**) and for error (**E**) using ellipses (in two

dimensions), ellipsoids (in three dimensions), or by line segments in

one dimension. For the theory and applications, see:

-   [Friendly (2007)](http://datavis.ca/papers/jcgs-heplots.pdf) for the basic theory on which this is based.

-   [Fox, Friendly and Monette (2009)](https://datavis.ca/papers/FoxFriendlyMonette-2009.pdf) for a brief introduction,

-   [Friendly (2010)](http://www.jstatsoft.org/v37/i04/paper) for the application of these ideas to repeated

    measure designs,

-   [Friendly, Monette and Fox (2013)](http://datavis.ca/papers/ellipses-STS402.pdf) for a general discussion of the role of elliptical geometry in statistical understanding, 

-   [Friendly & Sigal (2017)](https://doi.org/10.20982/tqmp.13.1.p020) for an applied R tutorial, 

-   [Friendly & Sigal (2018)](https://www.datavis.ca/papers/EqCov-TAS.pdf) for theory and examples of visualizing equality of covariance matrices.

If you use this work in teaching or research, please cite it as given by `citation("heplots")` or see [Citation](authors.html#citation).

Other topics now addressed here include:

-   robust MLMs, using iteratively re-weighted least squared to

    down-weight observations with large multivariate residuals,

    `robmlm()`.

-   `Mahalanobis()` calculates classical and _robust_ Mahalanobis squared

    distances using MCD and MVE estimators of center and covariance.

-   visualizing tests for equality of covariance matrices in MLMs (Box's

    M test), `boxM()` and `plot.boxM()`. Also: `bartlettTests()` and `LeveneTests()`

    for homogeneity of variance for each response in a MLM.

-   $\chi^2$ Q-Q plots for MLMs (`cqplot()`) to detect outliers and

    assess multivariate normality of residuals.

-   bivariate coefficient plots showing elliptical confidence regions

    (`coefplot()`).

In this respect, the `heplots` package now aims to provide a wide range

of tools for analyzing and visualizing multivariate response linear

models, together with other packages:



-   The related [`candisc`](https://friendly.github.io/candisc/) package

    provides HE plots in **canonical discriminant** space, the space of

    linear combinations of the responses that show the maximum possible

    effects and for canonical correlation in multivariate regression

    designs. See the [package documentation](https://friendly.github.io/candisc/)

    for details.



-   Another package,

    [`mvinfluence`](https://friendly.github.io/mvinfluence/), provides

    diagnostic measures and plots for **influential observations** in MLM

    designs. See the [package documentation](https://friendly.github.io/mvinfluence/)

    for details.

Several tutorial vignettes are also included. See

`vignette(package="heplots")`.

## Installation

+-------------------+----------------------------------------------------------------------------+

| CRAN version      | `install.packages("heplots")`                                              |

+-------------------+----------------------------------------------------------------------------+

| R-universe        | `install.packages("heplots", repos = c('https://friendly.r-universe.dev')` |

+-------------------+----------------------------------------------------------------------------+

| Development       | `remotes::install_github("friendly/heplots")`                              |

| version           |                                                                            |

+-------------------+----------------------------------------------------------------------------+

## HE plot functions

The graphical functions contained here all display multivariate model

effects in variable (**data**) space, for one or more response variables

(or contrasts among response variables in repeated measures designs).

-   `heplot()` constructs two-dimensional HE plots for model terms and

    linear hypotheses for pairs of response variables in multivariate

    linear models.

-   `heplot3d()` constructs analogous 3D plots for triples of response

    variables.

-   The `pairs` method, `pairs.mlm()` constructs a scatterplot matrix of

    pairwise HE plots.

-   `heplot1d()` constructs 1-dimensional analogs of HE plots for model

    terms and linear hypotheses for single response variables.

## Other functions

-   `glance.mlm()` extends `broom::glance.lm()` to multivariate response

    models, giving a one-line statistical summary for each response

    variable. `uniStats()` does something similar, but formatted more like a ANOVA table.

-   `boxM()` Calculates Box's *M* test for homogeneity of covariance

    matrices in a MANOVA design. A `plot` method displays a visual

    representation of the components of the test. Associated with this,

    `bartletTests()` and `levineTests()` give the univariate tests of

    homogeneity of variance for each response measure in a MLM.

-   `covEllipses()` draw covariance (data) ellipses for one or more

    group, optionally including the ellipse for the pooled within-group

    covariance. 

    

-   `coefplot()` for an MLM object draws bivariate confidence ellipses.

### Repeated measure designs

For repeated measure designs, between-subject effects and within-subject

effects must be plotted separately, because the error terms (**E**

matrices) differ. For terms involving within-subject effects, these

functions carry out a linear transformation of the matrix **Y** of

responses to a matrix **Y M**, where **M** is the model matrix for a

term in the intra-subject design and produce plots of the **H** and **E**

matrices in this transformed space. The vignette `"repeated"` describes

these graphical methods for repeated measures designs. (This

paper [HE plots for repeated measures

designs](http://www.jstatsoft.org/v37/i04/paper) is now provided as a PDF vignette.)

## Datasets

The package also provides a large collection of data sets illustrating a

variety of multivariate linear models of the types listed above,

together with graphical displays. The table below classifies these with

method tags. Their names are linked to their documentation with graphical output on the

`pkgdown` website, [].

```{r datasets, echo=FALSE}

library(here)

library(dplyr)

library(tinytable)

#dsets <- read.csv(here::here("extra", "datasets.csv"))

dsets <- read.csv("https://raw.githubusercontent.com/friendly/heplots/master/extra/datasets.csv")

dsets <- dsets |> dplyr::select(-X) |> arrange(tolower(dataset))

# link dataset to pkgdown doc

refurl <- "http://friendly.github.io/heplots/reference/"

dsets <- dsets |>

  mutate(dataset = glue::glue("[{dataset}]({refurl}{dataset}.html)")) 

tinytable::tt(dsets)

#knitr::kable(dsets)

```

## Examples

This example illustrates HE plots using the classic `iris` data set. How

do the means of the flower variables differ by `Species`? This dataset

was the impetus for R. A. Fisher (1936) to propose a method of

discriminant analysis using data collected by Edgar Anderson (1928).

Though some may rightly deprecate Fisher for being a supporter of

eugenics, Anderson's `iris` dataset should not be blamed.

A basic HE plot shows the **H** and **E** ellipses for the first two

response variables (here: `Sepal.Length` and `Sepal.Width`). The

multivariate test is significant (by Roy's test) *iff* the **H** ellipse

projects *anywhere* outside the **E** ellipse.

The positions of the group means show how they differ on the two

response variables shown, and provide an interpretation of the

orientation of the **H** ellipse: it is long in the directions of

differences among the means.

```{r iris1}

#| echo=-1,

#| out.width="70%",

#| fig.cap = "HE plot of sepal length and Sepal width for the iris data"

par(mar=c(4,4,1,1)+.1)

iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ 

                 Species, data=iris)

heplot(iris.mod)

```

### Contrasts

Contrasts or other linear hypotheses can be shown as well, and the

ellipses look better if they are filled. We create contrasts to test the

differences between `versacolor` and `virginca` and also between

`setosa` and the average of the other two. Each 1 df contrast plots as

degenerate 1D ellipse-- a line.

Because these contrasts are orthogonal, they add to the total 2 df

effect of `Species`. Note how the first contrast, labeled `V:V`,

distinguishes the means of *versicolor* from *virginica*; the second

contrast, `S:VV` distinguishes `setosa` from the other two.

```{r iris2, out.width="70%"}

#| fig.cap = "HE plot of sepal length and Sepal width for the iris data, showing lines reflecting two contrasts among iris species."

par(mar=c(4,4,1,1)+.1)

contrasts(iris$Species)<-matrix(c(0, -1, 1, 

                                  2, -1, -1), nrow=3, ncol=2)

contrasts(iris$Species)

iris.mod <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ 

                 Species, data=iris)

hyp <- list("V:V"="Species1","S:VV"="Species2")

heplot(iris.mod, hypotheses=hyp, 

       fill=TRUE, fill.alpha=0.1)

```

### All pairwise HE plots

All pairwise HE plots are produced using the `pairs()` method for MLM

objects.In the plot, note how the means of most pairs of variables are very

highly correlated, in the order Setosa < Versicolor < Virginica, but this

pattern doesn't hold for relations with `Sepal.Width`.

```{r, iris3}

#| out.width="100%",

#| fig.height = 6,

#| fig.width = 6,

#| fig.cap = "Scatterplot matrix of pairwise HE plots for the iris data."

pairs(iris.mod, hypotheses=hyp, hyp.labels=FALSE,

      fill=TRUE, fill.alpha=0.1)

```

### Canonical discriminant view

For more than two response variables, a multivariate effect can be viewed more simply by projecting

the data into canonical space --- the linear combinations of the responses which show the greatest

differences among the group means relative to within-group scatter. The computations are performed

with the [`candisc`](https://github.com/friendly/candisc/) package, which has an `heplot.candisc()`

method.

```{r iris-can0}

library(candisc)

iris.can <- candisc(iris.mod) |> print()

```

The HE plot in canonical space shows that the differences among species are nearly

entirely one-dimensional. The weights for the variables on the first dimension

show how `Sepal.Width` differs from the other size variables.

```{r iris-can}

#| out.width = "60%",

#| echo = -1,

#| fig.cap = "Canonical HE plot for the iris data"

par(mar=c(4,4,1,1)+.1)

# HE plot in canonical space

heplot(iris.can, var.pos = 1, scale = 40)

```

### Covariance ellipses

MANOVA relies on the assumption that within-group covariance matrices are all equal.

It is useful to visualize these in the space of some of the predictors.

`covEllipses()` provides this both for classical and robust (`method="mve"`) estimates.

The figure below shows these for the three Iris species and the 

pooled covariance matrix, which is the same as the **E** matrix used

in MANOVA tests.

```{r iris4, out.width="80%"}

#| echo = -1,

#| fig.cap = "Covariance ellipses for the iris data, showing the classical and robust estimates."

par(mar=c(4,4,1,1)+.1)

covEllipses(iris[,1:4], iris$Species)

covEllipses(iris[,1:4], iris$Species, 

            fill=TRUE, method="mve", add=TRUE, labels="")

```

## References

Anderson, E. (1928). The Problem of Species in the Northern Blue Flags,

Iris versicolor L. and Iris virginica L. *Annals of the Missouri

Botanical Garden*, **13**, 241--313.

Fisher, R. A. (1936). The Use of Multiple Measurements in Taxonomic

Problems. *Annals of Eugenics*, **8**, 379--388.

Friendly, M. (2006).

[Data Ellipses, HE Plots and Reduced-Rank Displays for Multivariate Linear Models: 

SAS Software and Examples.](https://www.jstatsoft.org/article/view/v017i06) 

_Journal of Statistical Software_, **17**, 1-42.

Friendly, M. (2007).  [HE plots for Multivariate General Linear Models](http://datavis.ca/papers/jcgs-heplots.pdf).

_Journal of Computational and Graphical Statistics_, **16**(2) 421-444.

DOI: 10.1198/106186007X208407.

Fox, J., Friendly, M. & Monette, G. (2009). [Visualizing hypothesis

tests in multivariate linear models: The heplots package for R](https://datavis.ca/papers/FoxFriendlyMonette-2009.pdf) *Computational Statistics*, **24**, 233-246.

Friendly, M. (2010). [HE plots for repeated measures

designs](http://www.jstatsoft.org/v37/i04/paper). *Journal of

Statistical Software*, **37**, 1--37.

Friendly, M.; Monette, G. & Fox, J. (2013). [Elliptical Insights:

Understanding Statistical Methods Through Elliptical

Geometry](http://datavis.ca/palers/ellipses-STS402.pdf) *Statistical

Science*, **28**, 1-39.

Friendly, M. & Sigal, M. (2017). [Graphical Methods for Multivariate

Linear Models in Psychological Research: An R

Tutorial.](https://doi.org/10.20982/tqmp.13.1.p020) *The Quantitative

Methods for Psychology*, **13**, 20-45.

Friendly, M. & Sigal, M. (2018): [Visualizing Tests for Equality of

Covariance Matrices](https://www.datavis.ca/papers/EqCov-TAS.pdf), _The American Statistician_, [DOI](https://doi.org/10.1080/00031305.2018.1497537)
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/friendly/heplots

Awesome Lists containing this project

README