https://github.com/geoarrow/geoarrow-r
Extension types for geospatial data for use with 'Arrow'
- Host: GitHub
- URL: https://github.com/geoarrow/geoarrow-r
- Owner: geoarrow
- License: apache-2.0
- Created: 2021-11-25T20:03:47.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-06-15T00:47:20.000Z (10 months ago)
- Last Synced: 2024-10-13T12:51:59.365Z (6 months ago)
- Language: R
- Homepage: http://geoarrow.org/geoarrow-r/
- Size: 2.84 MB
- Stars: 153
- Watchers: 7
- Forks: 6
- Open Issues: 11
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE.md
Awesome Lists containing this project
- jimsghstars - geoarrow/geoarrow-r - Extension types for geospatial data for use with 'Arrow' (R)
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# geoarrow
The goal of geoarrow is to leverage the features of the [arrow](https://arrow.apache.org/docs/r/) package and the larger [Apache Arrow](https://arrow.apache.org/) ecosystem for geospatial data. The geoarrow package provides an R implementation of the [GeoParquet](https://github.com/opengeospatial/geoparquet) file format and the draft [geoarrow data specification](https://geoarrow.org), defining extension array types for vector geospatial data.
## Installation
You can install the released version of geoarrow from [CRAN](https://cran.r-project.org/) with:
``` r
install.packages("geoarrow")
```

You can install the development version of geoarrow from [GitHub](https://github.com/) with:
``` r
# install.packages("pak")
pak::pak("geoarrow/geoarrow-r")
```

## Example
The geoarrow package implements conversions to/from various geospatial types (e.g., sf, sfc, s2, wk) with various Arrow representations (e.g., arrow, nanoarrow). The most useful conversions are between the **arrow** and **sf** packages, which in most cases allow sf objects to be passed to **arrow** functions directly after `library(geoarrow)` or `requireNamespace("geoarrow")` has been called.
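As a rough illustration of the behaviour described above, the sketch below loads geoarrow without attaching it and then hands an sfc vector to `arrow::as_arrow_array()`. The two sample points are made up for illustration, and the exact extension type reported may depend on the installed geoarrow version.

```r
# Load geoarrow's namespace without attaching it; per the paragraph above,
# this should be enough for arrow functions to accept sf objects.
stopifnot(requireNamespace("geoarrow", quietly = TRUE))

# Two illustrative points (made-up data, not shipped with any package)
pts <- sf::st_sfc(sf::st_point(c(0, 1)), sf::st_point(c(2, 3)))

# Convert the sfc vector to an Arrow array
arr <- arrow::as_arrow_array(pts)

# Inspect the Arrow type chosen for the geometry
arr$type
```

Using `requireNamespace()` rather than `library()` keeps geoarrow off the search path, which can be convenient inside other packages or scripts that only need the conversions.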
```{r example}
library(geoarrow)
library(arrow, warn.conflicts = FALSE)
library(sf)

nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
tf <- tempfile(fileext = ".parquet")

nc |>
  tibble::as_tibble() |>
  write_parquet(tf)

open_dataset(tf) |>
  dplyr::filter(startsWith(NAME, "A")) |>
  dplyr::select(NAME, geom) |>
  st_as_sf()
```

By default, arrow objects are converted to a neutral wrapper around chunked Arrow memory, which in turn implements conversions to most spatial types:
```{r}
df <- read_parquet(tf)
df$geom
st_as_sfc(df$geom)
```

The entry point to creating arrays is `as_geoarrow_vctr()`:
```{r}
as_geoarrow_vctr(c("POINT (0 1)", "POINT (2 3)"))
```

By default, these conversions do not attempt to create a new storage type; however, you can request a storage type or infer one from the data:
```{r}
as_geoarrow_vctr(c("POINT (0 1)", "POINT (2 3)"), schema = geoarrow_native("POINT"))

vctr <- as_geoarrow_vctr(c("POINT (0 1)", "POINT (2 3)"))
as_geoarrow_vctr(vctr, schema = infer_geoarrow_schema(vctr))
```

A number of example files are available that can be read with `arrow::read_ipc_file()`:
```{r}
url <- "https://github.com/geoarrow/geoarrow-data/releases/download/v0.1.0/ns-water-basin_point.arrow"
tab <- read_ipc_file(url, as_data_frame = FALSE)
tab$geometry$type
```
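
To tie the pieces above together, here is a minimal round-trip sketch: it writes a small sf object to Parquet, reads it back, and checks that the coordinates survive the conversion. The two-point dataset and the `coords_match` variable are illustrative assumptions, not part of the package; only functions already used in this README plus basic sf constructors are called.

```r
library(geoarrow)
library(arrow, warn.conflicts = FALSE)
library(sf)

# Illustrative data: two points with a CRS (made up for this sketch)
pts <- st_sf(
  name = c("a", "b"),
  geom = st_sfc(st_point(c(0, 1)), st_point(c(2, 3)), crs = 4326)
)

# Write to a temporary Parquet file, dropping the sf class first
# as in the example above
tf <- tempfile(fileext = ".parquet")
pts |>
  tibble::as_tibble() |>
  write_parquet(tf)

# Read back; the geometry column arrives as a geoarrow wrapper
df <- read_parquet(tf)
geom_back <- st_as_sfc(df$geom)

# Compare coordinates before and after the round trip
coords_match <- isTRUE(all.equal(
  unname(st_coordinates(geom_back)),
  unname(st_coordinates(pts$geom))
))
coords_match
```

A check like this can be a quick sanity test when changing the storage type passed via `schema`, since the geometry values should be unaffected by how they are stored.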