https://adriancorrendo.github.io/metrica/

Prediction Performance Metrics
https://adriancorrendo.github.io/metrica/
Last synced: 7 months ago
JSON representation
Prediction Performance Metrics
Host: GitHub
URL: https://adriancorrendo.github.io/metrica/
Owner: adriancorrendo
License: other
Created: 2021-10-07T18:52:35.000Z (about 4 years ago)
Default Branch: master
Last Pushed: 2024-06-30T16:09:43.000Z (over 1 year ago)
Last Synced: 2025-05-03T04:33:37.760Z (7 months ago)
Language: R
Homepage: https://adriancorrendo.github.io/metrica/
Size: 32.6 MB
Stars: 78
Watchers: 6
Forks: 8
Open Issues: 14
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project

AwesomeResponsibleAI - metrica: Prediction performance metrics
awesome-ggplot2 - metrica
README

          ---

output: github_document

---

```{r, include = FALSE}

knitr::opts_chunk$set(

  collapse = TRUE, 

  comment = "#>",

  fig.path = "man/figures/README-",

  out.width = "100%"

)

```

# metrica: Prediction performance metrics.

[![CRAN status](https://www.r-pkg.org/badges/version/metrica)](https://CRAN.R-project.org/package=metrica)

[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/grand-total/metrica?color=blue)](https://r-pkg.org/pkg/metrica)

[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/last-month/metrica?color=blue)](https://r-pkg.org/pkg/metrica) 


[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/adriancorrendo/metrica?branch=master&svg=true)](https://ci.appveyor.com/project/adriancorrendo/metrica)

[![R-CMD-check](https://github.com/adriancorrendo/metrica/workflows/R-CMD-check/badge.svg)](https://github.com/adriancorrendo/metrica/actions)

[![codecov](https://codecov.io/gh/adriancorrendo/metrica/branch/master/graph/badge.svg?token=CfK5NhXzYn)](https://app.codecov.io/gh/adriancorrendo/metrica)

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6474101.svg)](https://doi.org/10.5281/zenodo.6474101)

[![status](https://joss.theoj.org/papers/10.21105/joss.04655/status.svg)](https://joss.theoj.org/papers/10.21105/joss.04655)

[![R-CMD-check](https://github.com/adriancorrendo/metrica/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/adriancorrendo/metrica/actions/workflows/R-CMD-check.yaml)

## Introduction 


 


`metrica` is a compilation of more than 80 functions designed to quantitatively and visually evaluate the prediction performance of regression (continuous variables) and classification (categorical variables) point-forecast models (e.g. APSIM, DSSAT, DNDC, Supervised Machine Learning). `metrica` offers a toolbox with a wide spectrum of goodness of fit, error metrics, indices, and coefficients accounting for different aspects of the agreement between predicted and observed values, plus some basic visualization functions to assess models performance (e.g. confusion matrix, scatter with regression line; Bland-Altman plot) provided in customizable format (ggplot). 

For supervised models, always keep in mind the concept of "cross-validation" since predicted values 

should ideally come from out-of-bag samples (unseen by training sets) to avoid overestimation of the prediction performance. 


Check the Documentation at [https://adriancorrendo.github.io/metrica/](https://adriancorrendo.github.io/metrica/) 


**Vignettes** 


[1. List of metrics for Regression](https://adriancorrendo.github.io/metrica/articles/available_metrics_regression.html)




[2. List of metrics for Classification](https://adriancorrendo.github.io/metrica/articles/available_metrics_classification.html)




[3. A regression case (numerical variables)](https://adriancorrendo.github.io/metrica/articles/regression_case.html)




[4. A classification case (categorical variables)](https://adriancorrendo.github.io/metrica/articles/classification_case.html)




[5. Import files from APSIM](https://adriancorrendo.github.io/metrica/articles/apsim_open.html)




[6. JOSS publication](https://adriancorrendo.github.io/metrica/articles/JOSS_publication.html)




[7. metrica Shiinyapp](https://adriancorrendo.github.io/metrica/articles/Shinyapp.html)




[8. Cheatsheet](https://adriancorrendo.github.io/metrica/articles/Cheatsheet.html)




## Functions 


For regression models, it includes 4 plotting functions (scatter, tiles, density, & Bland-Altman plots), and 48 prediction performance scores including error metrics (MBE, MAE, RAE, RMAE, MAPE, SMAPE, MSE, RMSE, RRMSE, RSR, PBE, iqRMSE), error decomposition (MLA, MLP, PLA, PLP, PAB, PPB, SB, SDSD, LCS, Ub, Uc, Ue), model efficiency (NSE, E1, Erel, KGE), indices of agreement (d, d1, d1r, RAC, AC, lambda), goodness of fit (r, R2, RSS, TSS, RSE), adjusted correlation coefficients (CCC, Xa, distance correlation-dcorr-, maximal information coefficient -MIC-), variability (uSD, var_u), and symmetric regression coefficients (B0_sma, B1_sma). Specifically for time-series predictions, `metrica` also includes the Mean Absolute Scaled Error (MASE). 


For classification (binomial and multinomial) tasks, it includes a

function to visualize the confusion matrix using ggplot2, and 27

functions of prediction scores including: accuracy, error rate,

precision (predictive positive value -ppv-), recall (or true positive

rate-TPR-), specificity (or true negative rate-TNR-, or selectivity),

balanced accuracy (balacc), F-score (fscore), adjusted F-score (agf),

G-mean (gmean), Bookmaker Informedness (bmi, a.k.a. Youden’s J-index -jindex-),

Markedness (deltaP, or mk), Matthews Correlation Coefficient (mcc, a.k.a.

phi-coefficient), Cohen’s Kappa (khat), negative predictive value (npv),

positive and negative likelihood ratios (posLr, negLr), diagnostic odds

ratio (dor), prevalence (preval), prevalence threshold (preval_t),

critical success index (csi, a.k.a. threat score or Jaccard Index -jaccardindex-),

false positive rate (FPR), false negative rate (FNR), false detection

rate (FDR), false omission rate (FOR), and area under the ROC curve

(AUC_roc). 


`metrica` also offers a function (\code{metrics_summary}) that allows users to run all prediction

performance scores at once. The user just needs to specify the type of

model (“regression” or “classification”). 


For more details visit the vignettes .

## Using the functions 


There are two basic arguments common to all `metrica` functions: 

(i) `obs`(Oi; observed, a.k.a. actual, measured, truth, target, label), and 

(ii) `pred` (Pi; predicted, a.k.a. simulated, fitted, modeled,  estimate) values. 


Optional arguments include `data` that allows to call an existing data frame 

containing both observed and predicted vectors, and `tidy`, which controls the 

type of output as a list (tidy = FALSE) or as a data.frame (tidy = TRUE). 


For regression, some specific functions for regression also require to define 

the axis `orientation`. For example, the slope of the symmetric linear regression 

describing the bivariate scatter (SMA). 


For binary classification (two classes), functions also require to check the 

`pos_level` arg., which indicates the alphanumeric order of the "positive level".

Normally, the most common binary denominations are c(0,1), c("Negative", "Positive"), 

c("FALSE", "TRUE"), so the default pos_level = 2 (1, "Positive", "TRUE"). However, 

other cases are also possible, such as c("Crop", "NoCrop") for which the user

needs to specify pos_level = 1. 


For multiclass classification tasks, some functions present the `atom` arg. (logical

TRUE / FALSE), which controls the output to be an overall average estimate across

all classes, or a class-wise estimate. For example, user might be interested in 

obtaining estimates of precision and recall for each possible class of the prediction. 


## 1. Installation

You can install the CRAN version of `metrica` with: 


``` r

install.packages("metrica")

```

You can install the development version from [GitHub](https://github.com/) with:

``` r

# install.packages("devtools")

devtools::install_github("adriancorrendo/metrica")

```

## 2. Native datasets

The *metrica* package comes with four example datasets of continuous variables (regression) from the APSIM software: 


1. `wheat`. 137 data-points of wheat grain N (grams per squared meter) 


2. `barley`. 69 data-points of barley grain number (x1000 grains per 

squared meter) 


3. `sorghum`. 36 data-points of sorghum grain number (x1000 grains per

squared meter) 


4. `chickpea`. 39 data-points of chickpea aboveground dry mass (kg per

hectare) 


These data correspond to the latest, up-to-date, documentation and

validation of version number 2020.03.27.4956. Data available at:

https://doi.org/10.7910/DVN/EJS4M0. Further details can be found at the official APSIM Next Generation website: https://APSIMnextgeneration.netlify.app/modeldocumentation 


In addition, `metrica` also provides two native examples for categorical variables (classification):

5. `land_cover` is a binary dataset of land cover using satellite images obtained in 2022 over a small region in Kansas (USA). Values equal to 1 are associated to vegetation, and values equal to 0 represent other type of land cover. Observed values come from human visualization, while predicted values were obtained with a Random Forest classifier.


6. `maize_phenology` is a data set of maize/corn (*Zea mays* L.) phenology (crop development stage) collected in Kansas (USA) during 2018. The data includes 16 different phenology stages. Observed values were obtained via human visualization, while predicted values were obtained with a Random Forest classifier. 


Any of the above-mentioned data sets can be called with `metrica::name_of_dataset`, for example: 


``` r

metrica::wheat

metrica::land_cover

```

## 3. Example Code

### Libraries

```{r warning=FALSE, message=FALSE}

library(metrica)

library(dplyr)

library(purrr)

library(ggplot2)

library(tidyr)

```

This is a basic example which shows you the core regression and classification functions of *metrica*: 


## 3.1. REGRESSION

```{r warning=FALSE, message=FALSE}

# 1. A. Create a random dataset

# Set seed for reproducibility

set.seed(1)

# Create a random vector (X) with 100 values

X <- rnorm(n = 100, mean = 0, sd = 10)

# Create a second vector (Y) with 100 values by adding error with respect

# to the first vector (X).

Y <- X + rnorm(n=100, mean = 0, sd = 3)

# Merge vectors in a data frame, rename them as synonyms of observed (measured) and predicted (simulated)

example.data <- data.frame(measured = X, simulated = Y)

# 1. B. Or call native example datasets

example.data <- barley %>%  # or 'wheat', 'sorghum', or 'chickpea'

# 1.b. create columns as synonyms of observed (measured) and predicted (simulated)

                mutate(measured = obs, simulated = pred)  

```

### 3.1.1. Plot functions

### 3.1.1.1. Create a customizable scatter plot with PO orientation

```{r warning=FALSE, message=FALSE, fig.height=4, fig.width=5, dpi=90}

barley.scat.plot <- 

  metrica::scatter_plot(data = example.data, 

                        obs = measured, 

                        pred = simulated,

                        orientation = "PO", 

                        print_eq = TRUE,

                        position_eq = c(x=24, y =8), 

                        # Optional arguments to customize the plot

                        shape_type = 21,

                        shape_color = "grey15",

                        shape_fill = "steelblue",

                        shape_size = 3,

                        regline_type = "F1",

                        regline_color = "#9e0059",

                        regline_size = 2)+

  # Customize axis breaks

  scale_y_continuous(breaks = seq(0,30, by = 5))+

  scale_x_continuous(breaks = seq(0,30, by = 5))

barley.scat.plot

# Alternative using vectors instead of dataframe

#metrica::scatter_plot(obs = example.data$obs, pred = example.data$pred)

```

### 3.1.1.2. Create tiles plot with OP orientation

```{r warning=FALSE, message=FALSE, fig.height=4, fig.width=5, dpi=90}

barley.tiles.plot <- 

  tiles_plot(data = example.data, 

                      obs = measured, 

                      pred = simulated,

                      bins = 10, 

                      orientation = "PO",

                      colors = c(low = "pink", high = "steelblue"))

barley.tiles.plot

```

### 3.1.1.3. Create a density plot with OP orientation

```{r warning=FALSE, message=FALSE, fig.height=4, fig.width=5, dpi=90}

barley.density.plot <-

metrica::density_plot(data = example.data, 

                      obs = measured, pred = simulated,

                      n = 5, 

                      orientation = "PO", 

           colors = c(low = "white", high = "steelblue") )+

  theme(legend.position = "none")

barley.density.plot

```

### 3.1.1.4. Create a Bland-Altman plot

```{r warning=FALSE, message=FALSE, fig.height=4, fig.width=5, dpi=90}

barley.ba.plot <- metrica::bland_altman_plot(data = example.data,

                                             obs = measured, pred = simulated)

barley.ba.plot

```

### 3.1.2. Metrics functions

### 3.1.2.2. Single estimates

```{r warning=FALSE, message=FALSE}

# a. Estimate coefficient of determination (R2)

metrica::R2(data = example.data, obs = measured, pred = simulated)

# b. Estimate root mean squared error (RMSE)

metrica::RMSE(data = example.data, obs = measured, pred = simulated)

# c. Estimate mean bias error (MBE)

metrica::MBE(data = example.data, obs = measured, pred = simulated)

# c. Estimate index of agreement (d)

metrica::d(data = example.data, obs = measured, pred = simulated)

# e. Estimate SMA regression intercept (B0)

metrica::B0_sma(data = example.data, obs = measured, pred = simulated, tidy = TRUE)

# f. Estimate SMA regression slope (B1)

metrica::B1_sma(data = example.data, obs = measured, pred = simulated)

```

### 3.1.2.2. Metrics Summary 

```{r warning=FALSE, message=FALSE}

metrics.sum <- metrics_summary(data = example.data, 

                               obs = measured, pred = simulated,

                               type = "regression")  

# Print first 15

head(metrics.sum, n = 15)

# Optional wrangling (WIDE)

metrics.sum.wide <- metrics.sum %>%

  tidyr::pivot_wider(tidyr::everything(),

                      names_from = "Metric",

                      values_from = "Score")

metrics.sum.wide

```

### 3.1.3. Run multiple datasets at once

### 3.1.3.1. Nested data

```{r warning=FALSE, message=FALSE}

# a. Create nested df with the native examples

nested.examples <- bind_rows(list(wheat = metrica::wheat, 

                                  barley = metrica::barley,

                                  sorghum = metrica::sorghum, 

                                  chickpea = metrica::chickpea), 

                             .id = "id") %>%

  dplyr::group_by(id) %>% tidyr::nest() %>% dplyr::ungroup()

head(nested.examples %>% group_by(id) %>% dplyr::slice_head(n=2))

# b. Run 

multiple.sum <- nested.examples %>% 

  # Store metrics in new.column "performance"

  mutate(performance = map(

    data, ~metrica::metrics_summary(data=., obs = obs, pred = pred, 

                                    type = "regression")))

head(multiple.sum)

```

### 3.1.3.2. Non-nested data 


```{r warning=F, message=F}

non_nested_summary <- nested.examples %>% unnest(cols = "data") %>% 

  group_by(id) %>% 

  summarise(metrics_summary(obs = obs, pred = pred, type = "regression")) %>% 

  dplyr::arrange(Metric)

head(non_nested_summary)

```

### 3.1.4. Print metrics in a plot

```{r warning=F, message=F, fig.height=5, fig.width=7, dpi=90}

df <- metrica::wheat

# Create list of selected metrics

selected.metrics <- c("MAE","RMSE", "RRMSE", "R2", "NSE", "KGE", "PLA", "PLP")

df <- metrica::wheat

# Create the plot

plot <- metrica::scatter_plot(data = df, 

                              obs = obs, pred = pred,

                              # Activate print_metrics arg.

                              print_metrics = TRUE, 

                              # Indicate metrics list

                              metrics_list = selected.metrics,

                              # Customize metrics position

                              position_metrics = c(x = 16 , y = 9),

                              # Customize equation position

                              position_eq = c(x = 16.2, y = 9.5))

plot

```

## 3.1. CLASSIFICATION 


### Example datasets 

```{r warning=FALSE, message=FALSE}

binomial_case <- data.frame(labels = sample(c("Pos","Neg"), 100, replace = TRUE),

                            predictions = sample(c("Pos","Neg"), 100, replace = TRUE)) %>% 

  mutate(predictions = as.factor(predictions), labels = as.factor(labels))

multinomial_case <- data.frame(labels = sample(c("Red","Green", "Blue"), 100, replace = TRUE),

                               predictions = sample(c("Red","Green", "Blue"), 100, replace = TRUE) ) %>% 

  mutate(predictions = as.factor(predictions), labels = as.factor(labels))

```

### 3.1.1. Confusion Matrix 


### 3.1.1.1. Binary

```{r warning=FALSE, message=FALSE, fig.height=6, fig.width=7, dpi=90}

# a. Print

binomial_case %>% confusion_matrix(obs = labels, pred = predictions, 

                                            plot = FALSE, colors = c(low="#f9dbbd" , high="#735d78"), 

                                            unit = "count")

# b. Plot

binomial_case %>% confusion_matrix(obs = labels, pred = predictions, 

                                            plot = TRUE, colors = c(low="#f9dbbd" , high="#735d78"), 

                                            unit = "count", print_metrics = TRUE)

```

### 3.1.1.2. Multiclass

```{r warning=FALSE, message=FALSE, fig.height=6, fig.width=7, dpi=90}

# a. Print

multinomial_case %>% confusion_matrix(obs = labels, 

                                      pred = predictions, 

                                      plot = FALSE, colors = c(low="#f9dbbd" , high="#735d78"),

                                      unit = "count")

# b. Plot

multinomial_case %>% confusion_matrix(obs = labels, 

                                      pred = predictions, 

                                      plot = TRUE, colors = c(low="#d3dbbd" , high="#885f78"), 

                                      unit = "count", print_metrics = TRUE)

```

### 3.1.1. Classification Metrics 


### 3.1.1.1. Single dataset 


```{r}

# Get classification metrics one by one

binomial_case %>% accuracy(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% error_rate(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% precision(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% recall(data = ., obs = labels, pred = predictions, atom = F, tidy=TRUE)

binomial_case %>% specificity(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% balacc(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% fscore(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% agf(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% gmean(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% khat(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% mcc(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% fmi(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% posLr(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% negLr(data = ., obs = labels, pred = predictions, tidy=TRUE)

binomial_case %>% dor(data = ., obs = labels, pred = predictions, tidy=TRUE)

# Get all at once with metrics_summary()

binomial_case %>% metrics_summary(data = ., obs = labels, pred = predictions, type = "classification")

# Multinomial

multinomial_case %>% metrics_summary(data = ., obs = labels, pred = predictions, type = "classification")

# Get a selected list at once with metrics_summary()

selected_class_metrics <- c("accuracy", "recall", "fscore")

# Binary

binomial_case %>% metrics_summary(data = ., obs = labels, pred = predictions, type = "classification",

                                  metrics_list = selected_class_metrics)

# Multiclass

multinomial_case %>% metrics_summary(data = ., obs = labels, pred = predictions, type = "classification",

                                  metrics_list = selected_class_metrics)

```

```{r}

multinomial_case %>% accuracy(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% error_rate(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% precision(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% recall(data = ., obs = labels, pred = predictions, atom = F, tidy=TRUE)

multinomial_case %>% specificity(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% balacc(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% fscore(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% agf(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% gmean(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% khat(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% mcc(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% fmi(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% posLr(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% negLr(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% dor(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% deltap(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% csi(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% FPR(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% FNR(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% FDR(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% FOR(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% preval(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% preval_t(data = ., obs = labels, pred = predictions, tidy=TRUE)

multinomial_case %>% AUC_roc(data = ., obs = labels, pred = predictions, tidy=TRUE)

```

## 4. Import data from APSIM

Please, visit the [vignette](https://adriancorrendo.github.io/metrica/articles/apsim_open.html)

  

## 5. Contributing to our package

Thank you for considering contributing to our open-source project. Although we are not directly funded to maintain `metrica`, we care about reproducible science, like you. Thus, all contributions are more than welcome!

There are multiple ways you can contribute to `metrica` such as asking questions, propose ideas, report bugs, improve the vignettes & documentation of functions, as well as contributing with code, of course. 

For comments, suggestions, and bug reports, we highly encourage to use [our GitHub issues section](https://github.com/adriancorrendo/metrica/issues).

To improve the documentation and contribute with code, we encourage to [fork the repo](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork) and use [pull requests](https://docs.github.com/en/get-started/quickstart/contributing-to-projects#making-a-pull-request) to contribute code. 

## 6. Code of Conduct

Please note that the metrica project is released with a [Contributor Code of Conduct](https://adriancorrendo.github.io/metrica/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://adriancorrendo.github.io/metrica/

Awesome Lists containing this project

README