Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ColinFay/odds
On Disk Data Storage for Cross-Session Access in R
https://github.com/ColinFay/odds
Last synced: 8 days ago
JSON representation
On Disk Data Storage for Cross-Session Access in R
- Host: GitHub
- URL: https://github.com/ColinFay/odds
- Owner: ColinFay
- License: other
- Archived: true
- Created: 2020-02-07T20:50:52.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-02-09T19:50:49.000Z (almost 5 years ago)
- Last Synced: 2024-08-13T07:15:11.189Z (4 months ago)
- Language: R
- Homepage:
- Size: 19.5 KB
- Stars: 16
- Watchers: 5
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- jimsghstars - ColinFay/odds - On Disk Data Storage for Cross-Session Access in R (R)
README
---
output: github_document
---```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```# odds
The goal of `{odds}` is to provide an on-disk data-storage of native R object, for cross-session data access.
## Installation
You can install the dev version of `{odds}` from GitHub with:
``` r
remotes::install_github("colinfay/odds")
```## Basic use
The main goal of `{odds}` is to create a data storage architecture, on disk, so that you can access the values from one session to another.
### How it works
By default, the storage is done at `~/.odds`, but it can be changed when creating the storage object.
__Note that the path is passed through `fs::path_norm()`, which doesn't treat `~` the same way as base R on Windows.__
```{r}
library(odds)
st <- Storage$new()
```There are two main methods: `set()` and `get()`.
The first saves a value under a name on the disk, the second retrieve this value from the storage.```{r}
st$set(head(iris), "a")
st$get("a")
```Storages can be namespaced, and the default is "global",
```{r}
nsp <- paste(sample(letters, 3), collapse = "")st$set(mtcars, "a", namespace = nsp)
st$get("a", namespace = nsp)
```### Cross session access
Let's create an object in another R session:
```{r}
library(callr)
rx <- r_bg(
function(){
library(odds)
st <- Storage$new()
st$set(head(airquality), "ping", namespace = "blop")
}
)
``````{r include = FALSE}
rx$wait()
```It's now accessible in the first session:
```{r}
st$get("ping", namespace = "blop")
```Values can be deleted:
```{r}
st$rm("ping", namespace = "blop")
```
Namespaces can be deleted:```{r}
st$remove_namespace(nsp)
st$remove_namespace("blop")
```## Overhead
Of course, reading from disk adds some overhead, but for small to medium size objects, the cost of `get`ting from disk instead of reading for RAM is pretty small.
```{r}
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
st$set(diamonds, "dm", "bench")
bench::mark(
ram = {
diamonds %>% filter(cut == "Ideal")
},
disk = {
st$get("dm", "bench") %>%
filter(cut == "Ideal")
}
)
``````{r include = FALSE}
st$remove_namespace("bench")
````set()` and `get()` are powered by `{qs}` `qread()` and `qwrite()` and take the same arguments, so you can use parameters to these functions to speed up the read and write timing.
Read the `{qs}` benchmark [online](https://github.com/traversc/qs#summary-table).
## Acknowledgment
This package heavily relies on the `{qs}` package.
Thanks to the package authors for their work.## Coc
Please note that the 'odds' project is released with a
[Contributor Code of Conduct](CODE_OF_CONDUCT.md).
By contributing to this project, you agree to abide by its terms.