https://github.com/shikokuchuo/sakura
Extension to R Serialization
- Host: GitHub
- URL: https://github.com/shikokuchuo/sakura
- Owner: shikokuchuo
- License: other
- Created: 2025-01-02T19:29:23.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-06T10:03:15.000Z (about 1 year ago)
- Last Synced: 2025-04-06T11:18:49.200Z (about 1 year ago)
- Topics: marshalling, r, serialization
- Language: C
- Homepage: https://shikokuchuo.net/sakura/
- Size: 1.56 MB
- Stars: 16
- Watchers: 1
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
README
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# sakura
[Lifecycle: experimental](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[CRAN](https://CRAN.R-project.org/package=sakura)
[GitHub Actions](https://github.com/shikokuchuo/sakura/actions)
[Codecov](https://app.codecov.io/gh/shikokuchuo/sakura)
```
________
/\ sa \
/ \ ku \
\ / ra /
\/_______/
```
### Extension to R Serialization
Extends the functionality of R serialization by augmenting the built-in reference hook system. This enhanced implementation allows an integrated single-pass operation that combines R serialization with third-party serialization methods.
Facilitates the serialization of even complex R objects that contain non-system reference objects, such as those accessed via external pointers, enabling their use in parallel and distributed computing.
This package was requested at a meeting of the [R Consortium](https://r-consortium.org/) [Marshalling and Serialization Working Group](https://github.com/RConsortium/marshalling-wg/) held at useR!2024 in Salzburg, Austria. It is designed to eventually provide a common framework for marshalling in R.
It extracts the functionality embedded within the [mirai](https://github.com/shikokuchuo/mirai) async framework for use in other contexts.
### Installation
Install the current release from CRAN:
```{r cran, eval=FALSE}
install.packages("sakura")
```
Or the development version using:
```{r devinstall, eval=FALSE}
pak::pak("shikokuchuo/sakura")
```
### Overview
Some R objects by their nature cannot be serialized, such as those accessed via an external pointer.
Using the [`arrow`](https://arrow.apache.org/docs/r/) package as an example:
```{r arrowfail,error=TRUE}
library(arrow, warn.conflicts = FALSE)
obj <- list(as_arrow_table(iris), as_arrow_table(mtcars))
unserialize(serialize(obj, NULL))
```
In such cases, `sakura::serial_config()` can be used to create custom serialization configurations, specifying functions that hook into R's native serialization mechanism for reference objects ('refhooks').
```{r arrowcfg}
cfg <- sakura::serial_config(
"ArrowTabular",
arrow::write_to_raw,
function(x) arrow::read_ipc_stream(x, as_data_frame = FALSE)
)
```
This configuration can then be supplied as the 'hook' argument for `sakura::serialize()` and `sakura::unserialize()`.
```{r arrowpass}
sakura::unserialize(sakura::serialize(obj, cfg), cfg)
```
This time, the arrow tables are handled seamlessly.
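For context, the 'refhook' mechanism that such configurations build on is also exposed in base R through the `refhook` argument of `serialize()` and `unserialize()`. The following is a minimal sketch using base R only; it does not show how `sakura` is implemented. In base R, the outgoing hook must return a character vector (or `NULL` to decline), so this sketch parks the object in a registry and writes only a key to the stream. The names `registry`, `out_hook`, `in_hook` and the class `handle` are hypothetical and used purely for illustration.

```r
# Hypothetical in-memory registry standing in for an out-of-band store.
registry <- new.env()

# Outgoing hook: called for non-system reference objects (environments,
# external pointers, weak references). Returning a character vector replaces
# the object in the stream; returning NULL leaves it to be serialized as normal.
out_hook <- function(x) {
  if (inherits(x, "handle")) {
    key <- paste0("handle-", length(registry) + 1L)
    assign(key, x, envir = registry)
    key
  }
}

# Incoming hook: receives the character vector written by the outgoing hook
# and returns the object that should take its place.
in_hook <- function(key) registry[[key]]

# A classed environment plays the role of the reference object.
h <- structure(new.env(), class = "handle")
h$value <- 42

x <- list(h, 1:3)
y <- unserialize(serialize(x, NULL, refhook = out_hook), refhook = in_hook)

# The restored element is the very same environment that was parked.
identical(y[[1]], h)
```

Because the base hook can only write a character vector, anything larger has to travel separately, as the registry does here. A configuration created with `sakura::serial_config()` instead supplies functions that produce and consume the serialized form directly, as in the `arrow` example above.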
Some serialization functions are vectorized: they accept and return a list of reference objects rather than handling one object at a time. In this case, create the configuration with `vec = TRUE`. Using `torch` as an example:
```{r torchfail, error=TRUE}
library(torch)
x <- list(torch_rand(5L), runif(5L))
unserialize(serialize(x, NULL))
```
Base R serialization above fails, but `sakura` serialization succeeds:
```{r torchpass}
cfg <- sakura::serial_config("torch_tensor", torch::torch_serialize, torch::torch_load, vec = TRUE)
sakura::unserialize(sakura::serialize(x, cfg), cfg)
```
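The difference between the two calling conventions can be sketched in base R. This is an illustration only, assuming that a non-vectorized pair of functions handles one object per call while a vectorized pair handles a whole list per call. The helpers `ser_one`, `unser_one`, `ser_many` and `unser_many` are hypothetical stand-ins and are not part of `sakura`, `arrow` or `torch`.

```r
# Hypothetical stand-ins built on base R serialization.

# Non-vectorized pair: one object in, one raw vector out (and back).
ser_one <- function(x) serialize(x, NULL)
unser_one <- function(raw) unserialize(raw)

# Vectorized pair: a list of objects in, a single raw vector out (and back).
ser_many <- function(xs) serialize(xs, NULL)
unser_many <- function(raw) unserialize(raw)

objs <- list(a = 1:3, b = letters)

# Per-object convention: one call, and one raw vector, per element.
raw_each <- lapply(objs, ser_one)
one_by_one <- lapply(raw_each, unser_one)

# Vectorized convention: the whole list goes through a single call.
raw_all <- ser_many(objs)
all_at_once <- unser_many(raw_all)

# Both conventions round-trip to the same result.
identical(one_by_one, all_at_once)
```

Under this assumption, the choice of `vec` simply has to match the convention the supplied functions follow.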
### C Interface
A C-level interface is provided. The public header file `sakura.h` is available in `inst/include` to all packages that declare `sakura` in `LinkingTo`. It may be used in the following way:
```c
#include <sakura.h>
sakura_sfunc sakura_serialize;
sakura_ufunc sakura_unserialize;
// runtime initialization:
sakura_serialize = (sakura_sfunc) R_GetCCallable("sakura", "sakura_serialize");
sakura_unserialize = (sakura_ufunc) R_GetCCallable("sakura", "sakura_unserialize");
```
### Acknowledgements
We would like to thank in particular:
- [R Core](https://www.r-project.org/contributors.html) for providing the interface to the R serialization mechanism.
- [Luke Tierney](https://github.com/ltierney/) and [Mike Cheng](https://github.com/coolbutuseless) for their meticulous efforts in documenting the serialization interface.
- [Daniel Falbel](https://github.com/dfalbel) for discussion around an efficient solution to serialization and transmission of torch tensors.
--
Please note that this project is released with a [Contributor Code of Conduct](https://shikokuchuo.net/sakura/CODE_OF_CONDUCT.html). By participating in this project you agree to abide by its terms.