https://github.com/morrowcj/remoteparts

remotePARTS is a set of tools for running Partitioned spatio-temporal auto regression analyses on remotely-sensed data sets.
https://github.com/morrowcj/remoteparts

autocorrelation big-data remote-sensing-in-r statistical-analysis

Last synced: 3 months ago
JSON representation

remotePARTS is a set of tools for running Partitioned spatio-temporal auto regression analyses on remotely-sensed data sets.

Host: GitHub
URL: https://github.com/morrowcj/remoteparts
Owner: morrowcj
License: gpl-3.0
Created: 2020-04-01T16:29:02.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2025-07-25T06:42:31.000Z (6 months ago)
Last Synced: 2025-10-22T04:49:25.683Z (3 months ago)
Topics: autocorrelation, big-data, remote-sensing-in-r, statistical-analysis
Language: R
Homepage: https://morrowcj.github.io/remotePARTS/
Size: 156 MB
Stars: 22
Watchers: 3
Forks: 6
Open Issues: 12
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE.md

Awesome Lists containing this project

README

          ---

output: github_document

---

```{r, include = FALSE}

knitr::opts_chunk$set(

  echo = TRUE, collapse = TRUE, comment = "#>",

  fig.path = "man/figures/README-", out.width = "100%"

)

```

## remotePARTS

[![CRAN status](https://www.r-pkg.org/badges/version/remotePARTS)](https://CRAN.R-project.org/package=remotePARTS)

[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)

[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

[![R-CMD-check](https://github.com/morrowcj/remotePARTS/workflows/R-CMD-check/badge.svg)](https://github.com/morrowcj/remotePARTS/actions)

[![status](https://joss.theoj.org/papers/c6a3da6a56aa0fb0e1f8a4f36cab12c2/status.svg)](https://joss.theoj.org/papers/c6a3da6a56aa0fb0e1f8a4f36cab12c2)

[![Codecov test coverage](https://codecov.io/gh/morrowcj/remotePARTS/graph/badge.svg)](https://app.codecov.io/gh/morrowcj/remotePARTS)

*remotePARTS* is a software package for the *R* statistical programming language. 

The package contains tools for analyzing spatiotemporal data, typically obtained 

via remote sensing.

## Description

These tools were created to test map-scale hypotheses about trends in large 

remotely sensed data sets, but they are useful for analyzing trends in any 

spatial data, with or without a temporal component. Statistical tests are 

conducted with the PARTS method for analyzing spatially autocorrelated time 

series (Ives et al., 2021). The method's unique approach can handle extremely

large data sets that other spatiotemporal models cannot, while still 

appropriately accounting for autocorrelation structure. This is 

done by partitioning the data into smaller chunks, analyzing chunks separately

and then combining the separate analyses into a single test, that accounts for

correlations among chunks, of the map-scale hypotheses.

## Installation

To install the package and it's dependencies from CRAN, use the following R code:

```{r, eval = FALSE}

install.packages("remotePARTS")

```

To install the latest stable version of this package from github, use

```{r, eval = FALSE}

remotes::install_github("morrowcj/remotePARTS")

```

and to test out the newest features and functionality, use 

```{r, eval = FALSE}

remotes::install_github("morrowcj/remotePARTS", ref = "develop")

```

To ensure the vignette is built when installing from GitHub, use 

```{r, eval = FALSE}

remotes::install_github("morrowcj/remotePARTS", build_vignettes = TRUE)

```

Then, upon successful installation, load the package with 

`library(remotePARTS)`. 

### Dependencies

Since the matrix operations in this package rely on C++ code, as implemented 

via the [RcppEigen package](https://github.com/RcppCore/RcppEigen),

the latest version of [Rtools](https://cran.r-project.org/bin/windows/Rtools/)

is required for Windows and C++11 is required for other systems. 

## Citation

To cite this package in publications, please use:

  Morrow CJ, Ives AR (2025). “remotePARTS: Spatiotemporal autoregression analyses for large data sets.” _Journal of Open Source

  Software_, *10*(109), 7937. doi:10.21105/joss.07937 .

## Contribution, bugs, and feature requests

If you wish to contribute to this package, report bugs, suggest new features, tests,

or behavior, correct typos, update documentation, or anything else, please submit a

[GitHub Issue](https://github.com/morrowcj/remotePARTS/issues). We welcome and

appreciate any and all feedback.

## Testing

To manually run the package's unit tests from a cloned directory, 

after installing dependencies, use

```{r, eval = FALSE, echo = TRUE}

devtools::test()  # run unit tests in tests/testthat/

devtools::check() # full R CMD check (tests + docs)

```

Test coverage is currently low, especially among functions with 

stochastic outcomes. As the package develops, additional tests

will be added.

## Typical Workflow

A typical *remotePARTS* workflow is comprised of two broad steps for analyzing

trends in spatiotemporal datasets: 1) time series analysis and 2) spatial 

analysis. For purely spatial problems, step 1 is skipped. We briefly summarize 

these steps and the expected data structure below.

#### Input data

Currently, *remotePARTS* requires that the data are formatted as "flat" files

(i.e., data frames with 1 row per pixel) with x- and y-coordinates. We will 

not go into detail of how to prepare your data here, as other packages are

dedicated to reading and manipulating spatial data (e.g., see 

`raster::rasterToPoints()`). We recognize that this is a limitation, since flat

files are highly inefficient. Future versions of this package may include

interfaces with raster objects if enough users express interest.

To demonstrate the package's basic functionality, we first simulate a 

small spatiotemporal data set for analysis:

```{r, warning=FALSE, message=FALSE}

library(tibble); library(dplyr); library(tidyr); library(viridisLite)

library(ggplot2); library(remotePARTS)

# set a random seed, for reproducibility

set.seed(42) # don't panic

# simulate a spatiotemporal response variable

sim_spatiotemp <- function(

    n, k, n_time = 4, b.0 = 0, b.x = 0.5, b.y = 1, 

    b.xy = 0.1, sd.xy = 0.2, b.t = 0.2, ar = 0.4,  

    sd.t = 0.1

){

  coords = expand_grid(

    x = seq(0, 1, length.out = n), y = seq(0, 1, length.out = k)

  )

  

  time = seq_len(n_time)

  

  tibble::tibble(

    x = coords[[1]], y = coords[[2]],

    z.0 = b.0 + x*b.x + y*b.y + x*y*b.xy,

    eps = rnorm(n = length(x), mean = 0, sd = sd.xy),

    time.effect = list(time * b.t),

  )  |> 

    rowwise() |>

    mutate(

      z.0 = z.0 + eps,

      sp.innov = list(z.0 + time.effect + rnorm(n_time, sd = sd.t)), 

      z = list(arima.sim(list(ar = ar), n_time, innov = sp.innov))

    ) |> 

    unnest_wider(z, names_sep = ".") |> 

    select(-"time.effect", -"sp.innov", -"eps")

}

```

The function defined above generates a data frame for `n` $\times$ `k` pixels. 

The response variable (`z`) depends upon the `x` and `y` coordinates of the map.

The resulting spatial patterns (`z.0`) are used as the random innovations

of an AR(1) time series model to generate the spatiotemporal response 

(`z.1` -- `z.4`). These data are visualized below:

```{r, fig.asp=0.8, fig.width=5, out.width="50%", cache = TRUE}

# build the data

dat <- sim_spatiotemp(n = 100, k = 100)

# extract coordinates

coords <- select(dat, x, y)

# visualize the data

dat |> 

  pivot_longer(cols = z.1:z.4, names_to = "time", values_to = "z") |> 

  mutate(time = as.numeric(gsub("z\\.", "", time))) |> 

  ggplot(aes(x = x, y = y, fill = z)) + 

  facet_wrap(~time, labeller = "label_both") + 

  geom_tile() + 

  scale_fill_viridis_c(option = "magma") 

```

#### 1. Time series analysis

With properly structured data, the first step is to conduct a time series 

analysis. This is done with the `fitAR_map` function.

```{r, cache = TRUE, message=FALSE, warning=FALSE}

# fit a pixel-wise autoregression model to the full map

AR_fit <- fitAR_map(Y = dat |> select(z.1:z.4) |> as.matrix(), coords = coords)

# combine results into a data frame

df <- data.frame(

  coords = AR_fit$coords, coefs = AR_fit$coefficients, 

  resids = AR_fit$residuals

)

```

This function returns time series regression coefficients and 

residual estimates for each pixel: 

```{r, fig.asp=0.8, fig.width=4, out.width="40%"}

df |> 

  ggplot(aes(x = coords.x, y = coords.y, fill = coefs.t)) +

  geom_tile() +

  labs(x = "x", y = "y", fill = "t coef") +

  scale_fill_viridis_c(option = "magma")

```

```{r, fig.asp=0.8, fig.width=5, out.width="50%"}

df |> 

  pivot_longer(resids.1:resids.4, names_to = "time", values_to = "resids") |> 

  mutate(time = as.numeric(gsub("resids\\.", "", time))) |> 

  ggplot(aes(x = coords.x, y = coords.y, fill = resids)) +

  facet_wrap(~time, labeller = "label_both") +

  geom_tile() +

  labs(x = "x", y = "y", fill = "resid") +

  scale_fill_viridis_c(option = "magma")

```

#### 2. spatial analysis

The second step is to conduct a spatial analysis with `fitGLS_partition`. In 

this case, we'll estimate how the temporal trend differs across the `x` and `y` 

coordinates.  

```{r, cache = TRUE}

# randomly divide the data into partitions

partitions <- sample_partitions(npix = nrow(df), partsize = 1000)

# fit the partitioned GLS

part_GLS <- fitGLS_partition(

  formula = coefs.t ~ coords.x + coords.y + coords.x:coords.y, data = df, 

  partmat = partitions, coord.names = c("coords.x", "coords.y"), ncores = 8

)

```

The results provide coefficient estimates that are corrected for spatial and 

temporal autocorrelation: 

```{r}

part_GLS$overall$t.test

```

Note that these are are not direct estimates of the parameters 

used to generate the data with `sim_spatiotemp` above. For example the 

coefficients for `coords.x` and `coords.y` are estimates of 

`b.t` $\times$ `b.x` and `b.t` $\times$ `b.y`.

##### 2a. Purely spatial problem

This method can also be used for purely spatial problems. Here we will use our

original spatial variable (`z.0`):

```{r, cache = TRUE}

# add the spatial variable into the data frame

df$z.0 <- dat$z.0

# fit the partitioned GLS

part_GLS1 <- fitGLS_partition(

  formula = z.0 ~ coords.x + coords.y + coords.x:coords.y, data = df, 

  partmat = partitions, coord.names = c("coords.x", "coords.y"), ncores = 8

)

```

```{r}

part_GLS1$overall$t.test

```

In this case, the coefficients *are* direct estimates of the spatial parameters 

(`b.0`, `b.x`, `b.y`, `b.xy`) given to `sim_spatiotemp`.

#### Vignette

For detailed examples of how to use `remotePARTS` and all its options, with 

real data, see the `Alaska` vignette:

```{r, eval = FALSE}

vignette("Alaska")

```

The latest stable version of the vignette is also hosted online at 

https://morrowcj.github.io/remotePARTS/Alaska.html.

## References

Ives, Anthony R., et al. "Statistical inference for trends in spatiotemporal data."

Remote Sensing of Environment 266 (2021): 112678. https://doi.org/10.1016/j.rse.2021.112678

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/morrowcj/remoteparts

Awesome Lists containing this project

README