https://github.com/atsyplenkov/tidyhydro

Commonly Used Metrics In Hydrological Modelling

---
output:
  github_document:
    html_preview: false
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

requireNamespace("hydroGOF", quietly = TRUE)
requireNamespace("bench", quietly = TRUE)
```

# tidyhydro

[![R-CMD-check](https://github.com/atsyplenkov/tidyhydro/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/atsyplenkov/tidyhydro/actions/workflows/R-CMD-check.yaml)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![CRAN status](https://www.r-pkg.org/badges/version/tidyhydro)](https://CRAN.R-project.org/package=tidyhydro)
![GitHub last commit](https://img.shields.io/github/last-commit/atsyplenkov/tidyhydro)

The `tidyhydro` package provides a set of metrics commonly used in hydrology (such as _NSE_, _KGE_, _pBIAS_) for use within the [`tidymodels`](https://www.tidymodels.org/) framework. Originally inspired by the [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) and [`hydroGOF`](https://github.com/hzambran/hydroGOF) packages, this library is written mainly in C++ and computes the desired goodness-of-fit criteria quickly.

Additionally, you'll find here C++ implementations of lesser-known yet powerful metrics used in reports from the United States Geological Survey (USGS) and in the National Environmental Monitoring Standards (NEMS) guidelines. Examples include _PRESS_ (Prediction Error Sum of Squares), _SFE_ (Standard Factorial Error), and _MSPE_ (Model Standard Percentage Error), among others. They are based on the equations from _Helsel et al._ ([2020](http://pubs.er.usgs.gov/publication/tm4A3)), _Rasmussen et al._ ([2008](https://pubs.usgs.gov/tm/tm3c4/)), _Hicks et al._ ([2020](https://www.nems.org.nz/documents/suspended-sediment)), and other sources (see the function documentation for details).

## Example
The `tidyhydro` package follows the philosophy of [`yardstick`](https://github.com/tidymodels/yardstick/tree/main) and provides S3 class methods for vectors and data frames. For example, one can estimate `NSE` and `pBIAS` for a data frame like this:

```{r example}
library(yardstick)
library(tidyhydro)
data(solubility_test)

nse(solubility_test, solubility, prediction)

pbias(solubility_test, solubility, prediction)
```

or create a [`metric_set`](https://yardstick.tidymodels.org/reference/metric_set.html) and estimate several metrics at once like this:

```{r metricset}
hydro_metrics <- yardstick::metric_set(nse, pbias)

hydro_metrics(solubility_test, solubility, prediction)
```
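Since the package also provides vector methods, the same metric can be computed directly from two numeric vectors. The chunk below is a minimal sketch: it assumes that `nse_vec()` (the vector function called in the benchmarking section) takes `truth` and `estimate` arguments as it does there, and `nse_value` is only an illustrative variable name.

```{r vector-interface}
library(yardstick)
library(tidyhydro)
data(solubility_test)

# Vector interface: pass observed and predicted values as plain numeric
# vectors instead of a data frame with column names
nse_value <- nse_vec(
  truth = solubility_test$solubility,
  estimate = solubility_test$prediction
)

nse_value
```

This form is handy when observations and predictions do not live in the same data frame.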

Sometimes a qualitative interpretation of model performance is needed, so every function accepts a `performance` argument. When `performance = TRUE`, the metric interpretation is returned according to Moriasi et al. ([2015](https://elibrary.asabe.org/abstract.asp?aid=46548&t=3&dabs=Y&redir=&redirType=)).

```{r interpretation}
hydro_metrics(solubility_test, solubility, prediction, performance = TRUE)
```

## Installation

You can install the development version of `tidyhydro` from [GitHub](https://github.com/atsyplenkov/tidyhydro) with:

``` r
# install.packages("devtools")
devtools::install_github("atsyplenkov/tidyhydro")

# OR

# install.packages("pak")
pak::pak("atsyplenkov/tidyhydro")
```

## Benchmarking
Since the package uses `Rcpp` in the background, it runs slightly faster than an equivalent base R implementation and than other R packages such as `hydroGOF`:
```{r benchmarking}
x <- runif(10^5)
y <- runif(10^5)

# Base R reference implementation of NSE
nse <- function(truth, estimate, na_rm = TRUE) {
  1 -
    (
      sum((truth - estimate)^2, na.rm = na_rm) /
        sum((truth - mean(truth, na.rm = na_rm))^2, na.rm = na_rm)
    )
}

# Compare the three implementations; check = TRUE verifies equal results
bench::mark(
  tidyhydro = tidyhydro::nse_vec(truth = x, estimate = y),
  hydroGOF = hydroGOF::NSE(sim = y, obs = x),
  baseR = nse(truth = x, estimate = y),
  check = TRUE,
  relative = TRUE,
  iterations = 100L
)
```
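
For reference, the base R helper in the benchmark corresponds to the usual definition of the Nash-Sutcliffe efficiency. In the notation below (chosen here for illustration), $O_i$ stands for the observed values (`truth`), $S_i$ for the simulated values (`estimate`), and $\bar{O}$ for the mean of the observations:

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left(O_i - S_i\right)^2}{\sum_{i=1}^{n} \left(O_i - \bar{O}\right)^2}
$$

A perfect simulation gives $\mathrm{NSE} = 1$, while a simulation that is no better than predicting the observed mean gives $\mathrm{NSE} = 0$.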

## See also
- [`hydroGOF`](https://github.com/hzambran/hydroGOF) - Goodness-of-fit functions for comparison of simulated and observed hydrological time series