Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/wcjochem/sfarrow
R package for reading/writing `sf` objects from/to parquet files with `arrow`.
https://github.com/wcjochem/sfarrow
Last synced: 3 months ago
JSON representation
R package for reading/writing `sf` objects from/to parquet files with `arrow`.
- Host: GitHub
- URL: https://github.com/wcjochem/sfarrow
- Owner: wcjochem
- License: other
- Created: 2020-11-04T07:47:37.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-11-03T20:43:49.000Z (about 2 years ago)
- Last Synced: 2024-07-21T14:24:51.133Z (4 months ago)
- Language: R
- Homepage: https://wcjochem.github.io/sfarrow/
- Size: 992 KB
- Stars: 74
- Watchers: 4
- Forks: 4
- Open Issues: 4
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - wcjochem/sfarrow - R package for reading/writing `sf` objects from/to parquet files with `arrow`. (R)
README
---
output:
github_document:
html_preview: false
---```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/REAsDME-",
out.width = "100%"
)
```# sfarrow: Read/Write Simple Feature Objects (`sf`) with 'Apache' 'Arrow'
`sfarrow` is a package for reading and writing Parquet and Feather files with
`sf` objects using `arrow` in `R`.Simple features are a popular format for representing spatial vector data using
`data.frames` and a list-like geometry column, implemented in the `R` package
[`sf`](https://r-spatial.github.io/sf/). Apache Parquet files are an
open-source, column-oriented data storage format
([https://parquet.apache.org/](https://parquet.apache.org/)) which enable
efficient read/writing for large files. Parquet files are becoming popular
across programming languages and can be used in `R` using the package
[`arrow`](https://github.com/apache/arrow/).The `sfarrow` implementation translates simple feature data objects using
well-known binary (WKB) format for geometries and reads/writes Parquet/Feather
files. A key goal of the package is for interoperability of the files
(particularly with Python `GeoPandas`), so coordinate reference system
information is maintained in a standard metadata format
([https://github.com/geopandas/geo-arrow-spec](https://github.com/geopandas/geo-arrow-spec)).
Note to users: this metadata format is not yet stable for production uses and
may change in the future.## Installation
`sfarrow` is available through CRAN with:
```{r, eval=FALSE}
install.packages('sfarrow')
```or it can be installed from Github with:
```{r eval=FALSE}
devtools::install_github("wcjochem/sfarrow@main")
```Load the library to begin using it.
```{r}
library(sfarrow)
```### `arrow` package
The installation requires the Arrow library which should be installed with the
`R` package `arrow` dependency. However, some systems may need to follow
additional steps to enable full support of that library. Please refer to the
`arrow`
[documentation](https://CRAN.R-project.org/package=arrow/vignettes/install.html).## Basic usage
Reading Parquet data of spatial files created with Python `GeoPandas`.
```{r}
# load Natural Earth low-res dataset.
# Created in Python with geopandas.to_parquet()
path <- system.file("extdata", "world.parquet", package = "sfarrow")world <- st_read_parquet(path)
world
plot(sf::st_geometry(world))
```Writing `sf` objects to Parquet format files. These Parquet files created with
`sfarrow` can be read within Python using `GeoPandas`.
```{r}
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet=TRUE)st_write_parquet(obj=nc, dsn=file.path(tempdir(), "nc.parquet"))
# read back into R
nc_p <- st_read_parquet(file.path(tempdir(), "nc.parquet"))nc_p
plot(sf::st_geometry(nc_p))
```For additional examples please see the vignettes.
## Contributions
Contributions, questions, ideas, and issue reports are welcome. Please raise an
issue to discuss or submit a pull request.## Acknowledgements
This work benefited from the work by developers in the GeoPandas, Arrow, and
r-spatial teams. Thank you to the teams for their excellent, open-source work.