Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mapme-initiative/mapme.biodiversity
Efficient analysis of spatial biodiversity datasets for global portfolios
https://github.com/mapme-initiative/mapme.biodiversity
environment eo gis mapme spatial sustainability
Last synced: 3 months ago
JSON representation
Efficient analysis of spatial biodiversity datasets for global portfolios
- Host: GitHub
- URL: https://github.com/mapme-initiative/mapme.biodiversity
- Owner: mapme-initiative
- License: gpl-3.0
- Created: 2022-02-09T09:14:58.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-11T11:50:36.000Z (5 months ago)
- Last Synced: 2024-06-11T20:17:10.755Z (5 months ago)
- Topics: environment, eo, gis, mapme, spatial, sustainability
- Language: R
- Homepage: https://mapme-initiative.github.io/mapme.biodiversity/
- Size: 41.3 MB
- Stars: 24
- Watchers: 4
- Forks: 7
- Open Issues: 10
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md
Awesome Lists containing this project
- open-sustainable-technology - mapme.biodiversity - Efficient analysis of spatial biodiversity datasets for global portfolios. (Biosphere / Biodiversity Analysis and Metrics)
- jimsghstars - mapme-initiative/mapme.biodiversity - Efficient analysis of spatial biodiversity datasets for global portfolios (R)
README
---
output: github_document
---[![R-CMD-check](https://github.com/mapme-initiative/mapme.biodiversity/actions/workflows/R-CMD-check.yaml/badge.svg?branch=main)](https://github.com/mapme-initiative/mapme.biodiversity/actions)
[![Coverage Status](https://img.shields.io/codecov/c/github/mapme-initiative/mapme.biodiversity/master.svg)](https://app.codecov.io/github/mapme-initiative/mapme.biodiversity?branch=main)
[![CRAN status](https://badges.cranchecks.info/worst/mapme.biodiversity.svg)](https://cran.r-project.org/web/checks/check_results_mapme.biodiversity.html)
[![CRAN version](https://www.r-pkg.org/badges/version/mapme.biodiversity)](https://CRAN.R-project.org/package=mapme.biodiversity)
[![License](https://img.shields.io/badge/License-GPL (>=3)-brightgreen.svg?style=flat)](https://choosealicense.com/licenses/gpl-3.0/)# mapme.biodiversity
## About
Biodiversity areas, especially primary forests, provide multiple
ecosystem services for the local population and the planet as a whole.
The rapid expansion of human land use into natural ecosystems and the
impacts of the global climate crisis put natural ecosystems and the
global biodiversity under threat.The mapme.biodiversity package helps to analyse a number of biodiversity
related indicators and biodiversity threats based on freely available
geodata-sources such as the Global Forest Watch. It supports
computational efficient routines and heavy parallel computing in
cloud-infrastructures such as AWS or Microsoft Azure using in the statistical
programming language R. The package allows for the analysis of global
biodiversity portfolios with a thousand or millions of AOIs which is
normally only possible on dedicated platforms such as the Google Earth
Engine. It provides the possibility to e.g. analyse the World Database
of Protected Areas (WDPA) for a number of relevant indicators. The
primary use case of this package is to support scientific analysis and
data science for individuals and organizations who seek to preserve the
planet biodiversity. Its development is funded by the German
Development Bank KfW.## Installation
The package and its dependencies can be installed from CRAN via:
```{r install-cran, eval = FALSE}
install.packages("mapme.biodiversity", dependencies = TRUE)
```To install the development version, use the following command:
```{r install-devel, eval = FALSE}
remotes::install_github("https://github.com/mapme-initiative/mapme.biodiversity", dependencies = TRUE)
```## Available resources and indicators
Below is a list of the resources currently supported by `mapme.biodiversity`.
```{r available_resources, echo = FALSE}
knitr::kable(mapme.biodiversity::available_resources()[, c("name", "description", "licence")])
```Next, is a list of supported indicators.
```{r available_indicators, echo = FALSE}
knitr::kable(mapme.biodiversity::available_indicators()[, c("name", "description")])
```## Usage example
`{mapme.biodiversity}` works by constructing a portfolio from an sf
object. Specific raster and vector resource matching the spatio-temporal
extent of the portfolio are made available locally. Once all required
resources are available, indicators can be calculated individually for
each asset in the portfolio.```{r resources}
library(mapme.biodiversity)
library(sf)
```Once you have decided on an indicator you are interested in, you can
start by making the required resource available for your portfolio.
Using `mapme_options()` you can set an output directory, control the maximum
size of polygons before they are chunked into smaller parts, and control the
verbosity of the package.A portfolio is represented by an sf-object. It is required for the object to
only contain geometries of type `POLYGON` and `MULTIPOLYGON` as assets.
We can request the download of a resource for the spatial extent of our portfolio
by using the `get_resources()` function. We simply supply our portfolio and one
or more resource functions. Once the resources were made available, we can query
the calculation of an indicator by using the `calc_indicators()` function. This
function also expects the portfolio as input and one or more indicator functions.
Once the indicator has been calculated for all assets in a portfolio, the data
is returned as a nested list column to the original portfolio object. The
output of each indicator is standardized to common format, consisting of a
tibble with columns `datetime`, `variable`, `unit`, and `value`. We can
transform the the data to long format by using `portfolio_long()`.```{r calculation}
mapme_options(
outdir = system.file("res", package = "mapme.biodiversity"),
chunk_size = 1e6, # in ha
verbose = FALSE
)aoi <- system.file("extdata", "sierra_de_neiba_478140_2.gpkg", package = "mapme.biodiversity") %>%
sf::read_sf() %>%
get_resources(
get_gfw_treecover(version = "GFC-2023-v1.11"),
get_gfw_lossyear(version = "GFC-2023-v1.11"),
get_gfw_emissions()
) %>%
calc_indicators(calc_treecover_area_and_emissions(years = 2016:2017, min_size = 1, min_cover = 30)) %>%
portfolio_long()aoi
```## Using cloud storages
`{mapme.biodiversity}` leverages GDAL's capabilities for data I/O. For users of
this package, that means that integrating a cloud storage is as easy as setting
up a configuration file and changing the `outdir` argument in `mapme_options()`.
While you could also decide to use environment variables, we recommend to set
up a GDAL config file. You can find GDAL's documentation on this topic [here](https://gdal.org/user/configoptions.html#gdal-configuration-file).Suppose that we want to use an AWS S3 bucket that we control to
write resource data to. Let's assume this bucket is already set up and we wish
to refer to it in our R code as `mapme-data`. The GDAL configuration file
should look something like this:```{ini}
[credentials][.mapme-data]
path=/vsis3/mapme-data
AWS_SECRET_ACCESS_KEY=
AWS_ACCESS_KEY_ID=
```The connection will be handled based on GDAL's virtual file system. You can
find documentation on specific options for your cloud provider [here](https://gdal.org/user/virtual_file_systems.html#network-based-file-systems).Ideally, you would also set the following in the `.Renviron` file in your user's
home directory to ensure that GDAL is aware of this configuration when an R
session is started:```{ini}
GDAL_CONFIG_FILE = ""
```Then, in your scripts set the `outdir` option to the value specified with
the `path` variable in the configuration file:```{r s3-outdir, eval = FALSE}
mapme_options(outdir = "/vsis3/mapme-data")
```## A note on parallel computing
`{mapme.biodiversity}` follows the parallel computing paradigm of the
[`{future}`](https://cran.r-project.org/package=future) package. That means
that you as a user are in the control if and how you would like to set up
parallel processing. Since `{mapme.biodiversity} v0.7` supports
parallel processing on a nested-level. The outer level applies parallel processing
to the assets in a portfolio, the inner level to potential chunks for polygons
that are larger than the specified chunks size or the single components
of assets of type `MULTIPOLYGON`.The maximum chunk size is specified in hectares via `mapme_options()` and defaults
to 100,000 ha. Polygons larger than this threshold will be split into chunks
of roughly the same size.Fine-control of parallel processing is given by using future topologies
(find more information [here](https://cran.r-project.org/package=future/vignettes/future-3-topologies.html)).
To process all assets sequentially, but allow to spawn up to 4 workers to
process chunks in parallel you might specify:```{r parallel-1, eval = FALSE}
library(future)
plan(list(sequential, tweak(cluster, workers = 6)))
```As another example, with the code below one would apply parallel processing of 2
assets, with each having 4 workers available to process chunks, thus requiring
a total of 8 available cores on the host machine. Be sure to not request
more workers than available on your machine.```{r parallel, eval = FALSE}
library(progressr)plan(list(tweak(cluster, workers = 2), tweak(cluster, workers = 4)))
with_progress({
aoi <- calc_indicators(
aoi,
calc_treecover_area_and_emissions(
min_size = 1,
min_cover = 30
)
)
})plan(sequential) # close child processes
```Head over to the [online
documentation](https://mapme-initiative.github.io/mapme.biodiversity/index.html)
find more detailed information about the package.