---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# kostra2010R

[![R-CMD-check](https://github.com/dimfalk/kostra2010R/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/dimfalk/kostra2010R/actions/workflows/R-CMD-check.yaml) [![Codecov test coverage](https://codecov.io/gh/dimfalk/kostra2010R/branch/main/graph/badge.svg)](https://app.codecov.io/gh/dimfalk/kostra2010R?branch=main)

--- As of 01.01.2023, kostra2010R is officially replaced by [kostra2020](https://github.com/dimfalk/kostra2020). ---

The main goal of kostra2010R is to provide access to the KOSTRA-DWD-2010R dataset from within R.

Abstract (slightly modified) from the [official dataset description](https://opendata.dwd.de/climate_environment/CDC/grids_germany/return_periods/precipitation/KOSTRA/KOSTRA_DWD_2010R/gis/DESCRIPTION_gridsgermany_return_periods_precipitation_KOSTRA_KOSTRA_DWD_2010R_gis_en.pdf):

> This vector dataset contains statistical precipitation values as a function of duration and return period. The scope of the data is the engineering dimensioning of water management structures. These include sewerage networks, sewage treatment plants, pumping stations and retention basins. They are also often used for the dimensioning of drainage and infiltration systems. With the help of the data, however, it is also possible to estimate the precipitation level of severe heavy precipitation events with regard to their return periods. This estimation is often used to assess damage events.

> The dataset encompasses values of statistical precipitation (HN) for 18 duration levels D (5 min - 3 days) and 9 return periods Tn (1-100 a) for the whole grid spanning 79 × 107 cells. INDEX_RC describes the unique identifier of a grid cell.

## Installation

You can install the development version of kostra2010R with:

``` r
# install.packages("devtools")
devtools::install_github("dimfalk/kostra2010R")
```

and load the package via:

```{r}
library(kostra2010R)
```

## Getting started

### Get "INDEX_RC" based on row and column information

Sometimes grid cells are not identified by "INDEX_RC" directly but by a combination of row and column information (e.g. row 49, column 11). This information can easily be used to generate the corresponding "INDEX_RC" field.

```{r}
# Generate "INDEX_RC" based on row and column information.
idx_build(row = 42, col = 16)
```

If you want to check whether this constructed "INDEX_RC" field is really present in the dataset (or you found an ID in some report and are not sure whether it is still in use), use the following function.

```{r}
# Is the following "INDEX_RC" entry present in the dataset?
idx_exists("42016")
```
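
Judging from the example above, where row 42 and column 16 correspond to "42016", the identifier appears to be the row number followed by the column number zero-padded to three digits. The base R sketch below reproduces that pattern. `build_index_sketch()` is a hypothetical helper for illustration, not part of the package, and the padding rule is an assumption inferred from this single example.

```r
# Hypothetical helper, not part of kostra2010R: compose an "INDEX_RC"-like
# string from row and column, ASSUMING the column is zero-padded to 3 digits.
build_index_sketch <- function(row, col) {
  stopifnot(row >= 0, col >= 0, col < 1000)
  sprintf("%d%03d", as.integer(row), as.integer(col))
}

build_index_sketch(row = 42, col = 16)
#> [1] "42016"
```

For real work, prefer `idx_build()` followed by `idx_exists()`, since only the package knows which cells are actually present in the grid.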

### Get "INDEX_RC" based on spatial information

The most common use case will be to get the relevant "INDEX_RC" for given coordinates, e.g. for the location of a precipitation station, in order to classify duration-specific precipitation depths in terms of return periods.

```{r}
# Sf objects created based on specified coordinates. Don't forget to pass the CRS.
p1 <- get_centroid(c(6.09, 50.46), crs = "epsg:4326")
p1

p2 <- get_centroid(c(367773, 5703579), crs = "epsg:25832")
p2
```

For convenience, it is also possible to provide municipality names, postal codes or full addresses to be geocoded via the Nominatim API.

```{r}
# Sf objects created based on Nominatim API response. Internet access required!
p3 <- get_centroid("40477")
p3

p4 <- get_centroid("Freiburg im Breisgau")
p4

p5 <- get_centroid("Kronprinzenstr. 24, 45128 Essen")
p5
```

These coordinates can be used subsequently to spatially query the relevant grid index.

```{r}
# Get indices by topological intersection between location point and grid cells.
get_idx(p1)
get_idx(p2)
get_idx(p3)
get_idx(p4)
get_idx(p5)
```

### Construct cell-specific statistics from KOSTRA-DWD-2010R grid

Now that we have worked a little with the grid cell identifiers, let's take a first look at the dataset itself based on the "INDEX_RC" specified.

```{r}
# Build a tibble containing statistical precipitation depths as a function of
# duration and return periods for the grid cell specified.
stats <- get_stats("42016")

stats
```

Some descriptive attributes have been assigned to the tibble.

```{r}
attr(stats, "id")
attr(stats, "period")
attr(stats, "returnperiods_a")
attr(stats, "source")
```

### Get precipitation depths, calculate precipitation yield

If we now want to know the statistical precipitation depth, e.g. for an event of 4 hours duration with a recurrence interval of 100 years (1:100), it's just a matter of indexing. However, there is a helper function for this.

```{r}
# So we are interested in the rainfall amount [mm] for an event lasting 240 min
# with a return period of 100 a.
get_depth(stats, d = 240, tn = 100)
```

To account for statistical uncertainties, which range between 10 % and 20 % depending on the chosen return period as proposed in Malitz & Ertel (2015), use `uc = TRUE` to get an interval centered around the single value above.

```{r}
# Same data, but with uncertainties considered.
get_depth(stats, d = 240, tn = 100, uc = TRUE)
```
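
As a rough illustration of such an interval, the sketch below applies a symmetric percentage tolerance to a depth value. The class limits used here (±10 % up to 5 a, ±15 % up to 50 a, ±20 % up to 100 a) are an assumption based on the tolerance ranges commonly quoted for KOSTRA-DWD-2010R. `depth_interval_sketch()` is a hypothetical helper and may differ from what `get_depth(..., uc = TRUE)` does internally.

```r
# Hypothetical helper, not part of kostra2010R. The tolerance classes below
# are an ASSUMPTION, not taken from the package source.
depth_interval_sketch <- function(hn, tn) {
  stopifnot(tn >= 1, tn <= 100)
  tol <- if (tn <= 5) 0.10 else if (tn <= 50) 0.15 else 0.20
  c(lower = hn * (1 - tol), upper = hn * (1 + tol))
}

# Illustrative input only: 50 mm at a return period of 100 a.
depth_interval_sketch(hn = 50, tn = 100)
```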

If you need precipitation yield values [l/(s\*ha)] instead of precipitation depth [mm] or vice versa, make use of the following helper functions.

```{r}
as_yield(62.1, d = 240)

as_depth(43.1, d = 240)
```
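
The conversion itself follows from the units alone: 1 mm of rain equals 1 l/m², i.e. 10 000 l/ha, and a duration of `d` minutes equals `60 * d` seconds. A minimal sketch in base R, with hypothetical function names chosen so that they do not clash with the package's own helpers:

```r
# Hypothetical helpers, not part of kostra2010R.
# hn: precipitation depth [mm], rn: precipitation yield [l/(s*ha)], d: duration [min]
depth_to_yield_sketch <- function(hn, d) hn * 10000 / (d * 60)
yield_to_depth_sketch <- function(rn, d) rn * (d * 60) / 10000

round(depth_to_yield_sketch(62.1, d = 240), 1)
#> [1] 43.1
round(yield_to_depth_sketch(43.1, d = 240), 1)
#> [1] 62.1
```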

### Get return periods

Finally, we want to determine the return period according to the dataset for a given precipitation depth and duration.

```{r}
# Let's assume we measured 75.2 mm in 24 h.
get_returnp(stats, hn = 75.2, d = 1440)
```

Accordingly, the recurrence interval (annuality) of this event lies somewhere between 30 and 50 years as per KOSTRA-DWD-2010R.

The following edge cases are worth mentioning:

```{r}
# 1) In case a class boundary is hit, the return period is replicated.
get_returnp(stats, hn = 38.2, d = 1440)
```

```{r}
# 2) In case the return period tn is smaller than 1, interval opens with 0.
get_returnp(stats, hn = 26.4, d = 1440)
```

```{r}
# 3) In case the return period tn is larger than 100, interval closes with Inf.
get_returnp(stats, hn = 92.8, d = 1440)
```
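
The classification logic, including the three edge cases, can be sketched in base R with `findInterval()`. This is an illustration under stated assumptions, not the package's implementation: `bracket_returnp_sketch()` is a hypothetical helper, the nine return periods are assumed to be 1, 2, 3, 5, 10, 20, 30, 50 and 100 a, "replicated" is read as both interval bounds being equal, and the depth values are made up rather than taken from the dataset.

```r
# Return periods [a]; ASSUMED node values, see the paragraph above.
tn <- c(1, 2, 3, 5, 10, 20, 30, 50, 100)
# Made-up depths [mm] for a single duration, NOT taken from KOSTRA-DWD-2010R.
hn_nodes <- c(30, 36, 40, 45, 52, 59, 63, 68, 75)

# Hypothetical helper: return the pair of return periods enclosing hn.
bracket_returnp_sketch <- function(hn, hn_nodes, tn) {
  n <- length(tn)
  hit <- which(abs(hn_nodes - hn) < 1e-9)
  if (length(hit) > 0) return(c(tn[hit[1]], tn[hit[1]]))  # 1) class boundary hit
  if (hn < hn_nodes[1]) return(c(0, tn[1]))                # 2) below smallest node
  if (hn > hn_nodes[n]) return(c(tn[n], Inf))              # 3) above largest node
  i <- findInterval(hn, hn_nodes)                          # index of lower neighbour
  c(tn[i], tn[i + 1])
}

bracket_returnp_sketch(65, hn_nodes, tn)  # between the 30 a and 50 a nodes
bracket_returnp_sketch(36, hn_nodes, tn)  # exactly on the 2 a node
bracket_returnp_sketch(20, hn_nodes, tn)  # below the 1 a node
bracket_returnp_sketch(90, hn_nodes, tn)  # above the 100 a node
```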

### Return period interpolation

Although it may be somewhat questionable from a scientific perspective, you might nevertheless be interested in the return period estimated using linear interpolation between adjacent nodes:

```{r}
# Using the same example as above, previously resulting in 30 a < tn < 50 a.
get_returnp(stats, hn = 75.2, d = 1440, interpolate = TRUE)
```
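
Linear interpolation between adjacent nodes can be illustrated with base R's `approx()`. The sketch below is not the package's implementation: `interpolate_returnp_sketch()` is a hypothetical helper, the return period nodes are assumed, and the depth values are made up for demonstration only.

```r
# Return periods [a]; ASSUMED node values.
tn <- c(1, 2, 3, 5, 10, 20, 30, 50, 100)
# Made-up depths [mm] for a single duration, NOT taken from KOSTRA-DWD-2010R.
hn_nodes <- c(30, 36, 40, 45, 52, 59, 63, 68, 75)

# Hypothetical helper: interpolate the return period linearly in depth.
# approx() returns NA for depths outside the range of the nodes (rule = 1).
interpolate_returnp_sketch <- function(hn, hn_nodes, tn) {
  approx(x = hn_nodes, y = tn, xout = hn)$y
}

# 65 mm lies 2/5 of the way from the 30 a node (63 mm) to the 50 a node (68 mm).
interpolate_returnp_sketch(65, hn_nodes, tn)
#> [1] 38
```

Interpolating linearly in depth is only one possible choice. Since return periods are spaced unevenly, interpolating on a logarithmic return period axis would give different results.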

### Further utilization

Data can additionally be visualized as intensity-duration-frequency curves using `plot_idf()`, underpinned by `{ggplot2}` ...

```{r, fig.width = 9}
plot_idf(stats, log10 = TRUE)
```

... or exported to disk using `write_stats()` based on `write.table()`.

## Contributing

See [here](https://github.com/dimfalk/kostra2010R/blob/main/.github/CONTRIBUTING.md) if you'd like to contribute.