An open API service indexing awesome lists of open source software.

https://github.com/MHaringa/spatialrisk

R-package for spatial risk calculations
https://github.com/MHaringa/spatialrisk

actuarial-science insurance r solvency-ii spatial

Last synced: 3 months ago
JSON representation

R-package for spatial risk calculations

Awesome Lists containing this project

README

          

---
output: github_document
always_allow_html: yes
---

```{r, include = FALSE}
knitr::opts_chunk$set(
fig.path = "man/figures/"
)
```

# spatialrisk

[![CRAN Status](https://www.r-pkg.org/badges/version/spatialrisk)](https://cran.r-project.org/package=spatialrisk) [![Downloads](https://cranlogs.r-pkg.org/badges/spatialrisk?color=blue)](https://cran.r-project.org/package=spatialrisk)

`spatialrisk` is specifically designed for efficient spatial risk calculations, allowing users to quickly sum all observations within a defined radius. With key functions implemented in C++ using Rcpp, the package ensures fast performance for various spatial analysis tasks, including optimizing location and resource allocation.

## Installation

Install `spatialrisk` from CRAN:

```{r, eval = FALSE}
#| fig.alt: >
#| Install packages

install.packages("spatialrisk")
```

Or the development version from GitHub:

```{r gh-installation, eval = FALSE}
#| fig.alt: >
#| Install remote packages

# install.packages("remotes")
remotes::install_github("MHaringa/spatialrisk")
```

```{r, include = FALSE}
#| fig.alt: >
#| Load libraries

library(htmltools)
library(htmlwidgets)
library(webshot)
# install_phantomjs(version = "2.1.1", baseURL = "https://github.com/wch/webshot/releases/download/v0.3.1/")
```

## Example 1: Observations Within a Defined Radius

Filter all observations in `Groningen` located within a 100-meter radius circle centered at coordinates `(longitude, latitude) = (6.561561, 53.21326)`:

```{r example, eval = TRUE, message = FALSE, warning = FALSE}
#| fig.alt: >
#| Determine points in circle

library(spatialrisk)
circle <- points_in_circle(Groningen, lon_center = 6.571561,
lat_center = 53.21326, radius = 100)
circle
```

The sum of all observations within this circle is equal to:

```{r}
#| fig.alt: >
#| Calculate the sum of the points in circle

sum(circle$amount)

```

Next, calculate the sum of all observations within a specified radius. `concentration()` computes the total sum of all observations within a circle of a given radius for multiple locations. For each row in `df`, it finds the sum of all observations in `Groningen` that fall within a 100-meter radius from the specified (lon, lat) coordinates.

```{r example2, eval = TRUE}
#| fig.alt: >
#| Calculate the concentration

df <- data.frame(location = c("p1", "p2", "p3"),
lon = c(6.561561, 6.561398, 6.571561),
lat = c(53.21369, 53.21326, 53.21326))

conc <- concentration(df, Groningen, value = amount, radius = 100)
conc

```

Show that result is indeed equal to the result from the previous example:

```{r}
#| fig.alt: >
#| Check whether outcome is the same

isTRUE(sum(circle$amount) == conc$concentration[3])

```

## Example 2: Find Highest Concentrations

`spatialrisk` offers a fast and effective way to identify areas with the highest concentration of fire risks within a 200-meter radius—meeting a key requirement of the European Directive 2009/138 (Solvency II) and aligned with the principles of the Maximal Covering Location Problem (MCLP). While the directive doesn't mandate a specific method, `spatialrisk` provides a clear and systematic approach to mapping high-risk clusters. This helps insurers accurately assess their maximum exposure to catastrophic fire events and adjust their solvency capital requirements in line with Solvency II standards.

Under Solvency II, the fire risk challenge involves calculating the capital requirement for fire-related events—such as fires, explosions, and terrorist attacks—by identifying the largest concentration of insured assets at risk. This can be approached as a location optimization problem: finding the optimal center of a fixed-radius circle that captures the maximum total value of insured fire risks within that area. The structure of this problem closely mirrors the **Maximal Covering Location Problem (MCLP)**, a well-established model in operations research used to maximize coverage within a limited range.

**The Maximal Covering Location Problem (MCLP)**

Originally proposed by Church and ReVelle in 1974, the Maximal Covering Location Problem (MCLP) focuses on placing facilities in locations that maximize coverage of demand points within a predefined distance. In this context:

1. **Demand points** represent locations that require coverage, such as populations, assets, or infrastructure.
2. **Facilities** are placed to cover as many of these points as possible, within a fixed coverage radius.
3. The goal is to **maximize total demand covered** using a limited number of facilities and a set coverage distance.

**Strategic Rationale for Applying MCLP to Fire Risk Assessment**

The Solvency II directive requires insurers to assess their capital exposure to fire risk by identifying the highest concentration of insured assets within a 200-meter radius. This challenge aligns closely with the Maximal Covering Location Problem (MCLP), a proven model from operations research used to optimize coverage within a fixed area. The similarities are clear:

1. **Valuable Assets as Demand Points:** Each insured building or asset is treated as a demand point, similar to how MCLP models locations needing coverage.
2. **Fixed 200-Meter Radius:** The regulatory requirement mirrors MCLP’s coverage radius—assets are either within range or not.
3. **Focus on Maximum Exposure:** Just as MCLP aims to maximize covered demand, the fire risk assessment seeks to identify the location with the greatest insured value concentration.

### Algorithm: Maximal Covering Location Problem (MCLP) via Raster Analysis

#### Overview

This algorithm presents a practical and scalable solution for identifying high-risk locations using the Maximal Covering Location Problem (MCLP), implemented through spatial raster analysis. Built on the robust capabilities of the `terra` package in R, the approach transforms complex spatial data into actionable insights. It enables insurers and urban planners to pinpoint areas of highest risk concentration.

#### Key Steps in the Process

**1. Aggregating Spatial Risk Data**
*Objective: Create a simplified, high-level view of demand concentrations.*

- Spatial data points (e.g., insured assets) are converted into a raster grid with 50 x 50 meter cells.
- Each cell summarizes the total insured value within its boundaries, creating a spatial heatmap of risk exposure.

**2. Highlighting High-Risk Clusters**
*Objective: Identify areas where demand values cluster within proximity.*

- Using focal statistics, the algorithm analyzes each cell and its neighbors to highlight broader risk concentrations.
- A weighting matrix refines this analysis, revealing where the most significant clusters lie.

**3. Focusing the Search**
*Objective: Narrow in on the top risk zones.*

- The algorithm selects the top five areas with the highest aggregated risk values.
- Within each, it builds a detailed 20 x 20 grid (400 sub-points) to begin exploring optimal facility or response locations.

**4. Refining to High-Potential Zones**
*Objective: Eliminate low-impact areas and zoom in on promising locations.*

- Only zones that exceed a calculated risk threshold (lower bound) are retained.
- These areas are then mapped at a finer 1-meter resolution, concentrating analysis where it matters most.

**5. Pinpointing Optimal Locations**
*Objective: Identify the exact spots offering maximum coverage potential.*

- A high-resolution grid is generated within the selected areas.
- The algorithm evaluates each point’s ability to cover surrounding demand, producing a shortlist of optimal locations for intervention, resource allocation, or risk management.

#### Illustrative Example: Identifying Areas of Highest Risk Concentration

As a practical application of the above methodology, this example demonstrates how the algorithm can calculate the total demand (e.g., insured value) within a fixed radius around multiple locations.

`find_highest_concentration()` is used to determine the central coordinates of a circle with a constant radius that captures the maximum sum of surrounding demand points. This enables decision-makers—such as insurers or urban planners—to pinpoint where risk is most densely concentrated and where targeted action can yield the greatest impact.

In this case, the method is applied to the Groningen dataset, which contains spatial risk points relevant to the analysis.

**Visualization:**
Display all spatial demand points from the Groningen dataset as a foundation for identifying optimal coverage locations.

```{r example3a, eval = FALSE, message = FALSE, warning = FALSE}
#| fig.alt: >
#| Show points

plot_points(Groningen, value = "amount")

```

```{r, fig.alt = "Map with points in Groningen", echo = FALSE}
# ![](man/figures/example3a-1.png)
knitr::include_graphics("man/figures/example3a-1.png")
```


------------------------------------------------------------------------

**Determine the Optimal Central Location for Maximum Risk Coverage:**
In this step, the goal is to determine the central coordinates of a circular area that captures the highest concentration of demand. Using a fixed radius, the algorithm systematically scans all potential locations, calculating the total risk value within each surrounding area.

To support fast and scalable analysis, all core functions are written in C++. This ensures high-performance computation, allowing the algorithm to handle large and complex spatial datasets with exceptional speed and efficiency.

By applying `find_highest_concentration()`, we isolate the point that offers the greatest cumulative coverage:

```{r, eval = FALSE, echo = TRUE}
#| fig.alt: >
#| Calculate hightest concentration

hc <- find_highest_concentration(Groningen,
value = "amount",
radius = 200)

```

```{r, echo = FALSE, eval = TRUE, message = FALSE, warning = FALSE}
#| fig.alt: >
#| Check time for highest concentration calculation

st <- Sys.time()
hc <- spatialrisk::find_highest_concentration(spatialrisk::Groningen, "amount")
print(Sys.time() - st)

```

```{r}
#| fig.alt: >
#| Show output for the highest concentration

hc[[1]]

```

**Visualizing the Area of Highest Concentration:**
Plot the individual points located within the highest concentration zone. The total sum of their values corresponds to the reported concentration figure.

This area represents the highest demand concentration in the Groningen dataset. It notably includes two apartment buildings containing a large number of insured objects, which significantly contribute to the overall risk exposure.

```{r, echo = TRUE, eval = FALSE}
#| fig.alt: >
#| Show highest concentration

plot(hc)

```

```{r, fig.alt = "Map with highest concentrations", echo = FALSE}
# ![](man/figures/unnamed-chunk-9-1.png)
knitr::include_graphics("man/figures/unnamed-chunk-9-1.png")
```


------------------------------------------------------------------------

It is also possible to display the coordinates for multiple high-concentration areas. To visualize the second and third highest concentrations:

```{r, eval = TRUE, echo = TRUE}
#| fig.alt: >
#| Calculate the first three highest concentrations

hconc <- find_highest_concentration(Groningen,
value = "amount",
radius = 200,
top_n = 3)

```

Create interactive map:

```{r, eval = FALSE, echo = TRUE}
#| fig.alt: >
#| Show highest concentrations on interactive map

plot(hconc)

```

```{r, fig.alt = "Interactive map with highest concentrations", echo = FALSE}
# ![](man/figures/unnamed-chunk-11-1.png)
knitr::include_graphics("man/figures/unnamed-chunk-11-1.png")
```


------------------------------------------------------------------------

Display the objects located within the highest concentration circle:

```{r}
#| fig.alt: >
#| Show objects in the highest circle

hc[[2]]

```

## Example 3: Creating Choropleth Maps

`spatialrisk` also offers features for creating choropleth maps, which can be challenging in R. `points_to_polygon()` provides an elegant solution to this issue.

Typically, creating choropleths involves aggregating data at the regional level of the shapefile and then merging this aggregated data with the shapefile. However, this method can be problematic if the names in the data do not match those in the shapefile, such as when there are differences in punctuation or spelling in area names. To address this, `points_to_polygon()` uses the longitude and latitude of a point to accurately map it to the corresponding region. This approach simplifies the creation of choropleth maps at various regional levels.

For example, `points_to_polygon()` can be used to visualize the total sum insured at the municipality level across the Netherlands:

```{r example4, eval = FALSE, message = FALSE, warning = FALSE}
#| fig.alt: >
#| Map the points to the polygons

gemeente_sf <- points_to_polygon(nl_gemeente, insurance, sum(amount, na.rm = TRUE))

```

`choropleth()` generates a map from the simple feature object created in the previous step. There are two ways to create a choropleth map: if you set the mode to plot, it produces a static map. The clustering in this case is based on the Fisher-Jenks algorithm, a widely used method for classifying data in choropleth maps.

```{r example4b, eval = FALSE, message = FALSE, warning = FALSE}
#| fig.alt: >
#| Show choropleth on municipality level

choropleth(gemeente_sf, mode = "plot", legend_title = "Sum insured (EUR)", n = 5)

```

```{r, fig.alt = "Static choropleth on gemeente level", echo = FALSE}
# ![](man/figures/nl_choro1.png)
knitr::include_graphics("man/figures/nl_choro1.png")
```


If `mode` is set to `view` an interactive map is created:

```{r example4c, eval = FALSE, echo = TRUE}
#| fig.alt: >
#| Show interactive choropleth on municipality level

choropleth(gemeente_sf, mode = "view", legend_title = "Sum insured (EUR)")

```

```{r, fig.alt = "Interactive choropleth on gemeente level", echo = FALSE}
# ![](man/figures/nl_choro2.png)
knitr::include_graphics("man/figures/nl_choro2.png")
```


The following simple feature objects are available in `spatialrisk`: `nl_provincie`, `nl_corop`, `nl_gemeente`, `nl_postcode1`, `nl_postcode2`, `nl_postcode3`, `nl_postcode4`, `world_countries`, and `europe_countries`.