https://github.com/const-ae/coefexplainer
Understand How to Interpret the Coefficients of a Categorical Linear Model
https://github.com/const-ae/coefexplainer
Last synced: 9 months ago
JSON representation
Understand How to Interpret the Coefficients of a Categorical Linear Model
- Host: GitHub
- URL: https://github.com/const-ae/coefexplainer
- Owner: const-ae
- License: other
- Created: 2020-08-12T11:05:22.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2022-04-27T14:28:33.000Z (about 4 years ago)
- Last Synced: 2025-09-10T05:35:01.986Z (10 months ago)
- Language: R
- Size: 2.21 MB
- Stars: 10
- Watchers: 2
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project
README
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
fig.width = 12
)
```
# CoefExplainer
Understand How to Interpret the Coefficients of a Categorical Linear Model
## Installation
You can install the released version of CoefExplainer from [Github](https://CRAN.R-project.org) with:
``` r
devtools::install_github("const-ae/CoefExplainer")
```
## Example
Let's demonstrate the package with the `palmerpenguins` package.
First, we have to remove all `NA`'s.
```{r}
peng <- palmerpenguins::penguins
# Remove any NA's
peng <- peng[! apply(peng, 1, function(row) any(is.na(row))), ]
peng
```
We can now load the `CoefExplainer` package and parse a formula for a linear model with categorical covariates:
```{r}
library(CoefExplainer)
coefExplFit <- CoefExplainer(peng, flipper_length_mm ~ species + island + sex)
```
There are three different ways to look at the model:
1. A beeswarm plot for each group (black dots). For each group it shows how the coefficients are combined to arrive at the prediction for that group (blue line) and how that line compares against the true group mean (red line).
```{r}
plotModel(coefExplFit)
```
2. We can also look at the underlying model matrix
```{r}
plotModelMatrix(coefExplFit)
```
3. And lastly, we can look at the magnitude of each coefficient.
```{r}
plotCoef(coefExplFit)
```
# Advanced Example
What happens if we deal with an ordered factor?
```{r}
peng2 <- peng
peng2$bill_length_fct <- cut(peng2$bill_length_mm, breaks = 4, ordered_result = TRUE)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct))
```
We can use the `C()` function to change the contrast setting for the `bill_length_fct`. Note how the predictions (blue lines)
don't change, however the coefficients have very different interpretations depending on the contrast setting. Do you
recognize which is the default contrast for an ordered factor?
```{r}
peng2$bill_length_fct <- C(peng2$bill_length_fct, contr.treatment)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct), title = "Treatment Contrast")
peng2$bill_length_fct <- C(peng2$bill_length_fct, contr.sum)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct), title = "Sum Contrast")
peng2$bill_length_fct <- C(peng2$bill_length_fct, contr.poly)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct), title = "Polynomial Contrast")
peng2$bill_length_fct <- C(peng2$bill_length_fct, contr.helmert)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct), title = "Helmert Contrast")
```
I find all the coefficients above difficult to interpret. In my opinion, for ordered factors a better choice is `contr.step()`:
```{r}
contr.step <- function(n){
ret <- matrix(0, nrow = n, ncol = n - 1)
ret[lower.tri(ret)] <- 1
ret
}
peng2$bill_length_fct <- C(peng2$bill_length_fct, contr.step)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct), title = "Step Contrast")
```
The MASS package provides a similar function called `contr.sdif()` that produces coefficients that have the same values as the ones from the `contr.step()` function. The only difference is the intercept. In the `contr.step()`, the intercept corresponds to the mean of the first group, whereas in the `contr.sdif()` function it is the mean over all group means.
```{r}
peng2$bill_length_fct <- C(peng2$bill_length_fct, MASS::contr.sdif)
plotAll(CoefExplainer(peng2, flipper_length_mm ~ bill_length_fct), title = "MASS Step Contrast")
```
# Credit
If you find the package useful, also checkout the http://www.bioconductor.org/packages/ExploreModelMatrix/ by Charlotte Soneson et al.