https://github.com/thinkr-open/mongooser
A port of MongooseJS to R
https://github.com/thinkr-open/mongooser
Last synced: 11 months ago
JSON representation
A port of MongooseJS to R
- Host: GitHub
- URL: https://github.com/thinkr-open/mongooser
- Owner: ThinkR-open
- License: other
- Created: 2022-03-11T09:37:39.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-06-20T20:38:54.000Z (about 3 years ago)
- Last Synced: 2025-04-19T19:08:23.290Z (about 1 year ago)
- Language: R
- Size: 27.3 KB
- Stars: 5
- Watchers: 5
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
eval = TRUE,
error = TRUE
)
```
# mongooser
[WIP / DO NOT USE IN PRODUCTION]
The goal of mongooser is to port [mongoosejs](https://mongoosejs.com/) to R, for a more, predictable use of MongoDB in a production context.
## Installation
You can install the development version of mongooser like so:
``` r
pak::pak("thinkr-open/mongooser")
```
### What's wrong with `{mongolite}`?
Nothing, `{mongooser}` is actually built on top of it.
### Ok, so why?
An issue with MongoDB is that it doesn't enforce any kind of schema, and R isn't very well equiped to handle unstructured data.
Also, given that it's unstructured and does the type conversion for you, you can get unexpected results when using MongoDB with R.
And on top of that, human tend to be a little bit lazy, and if the db allows them to insert random unstructured elements in the db, well, they'll do it, and then some data engineer will cry and scream to scrap and restructure the data to be usable.
Here is a simple example:
```{r eval = TRUE, error = TRUE}
system("docker run -p 27017:27017 -d mongo")
Sys.sleep(10)
```
```{r include=FALSE}
mongolite::mongo()$drop()
```
```{r}
fridge <- mongolite::mongo()
monday_meal <- list(
day = Sys.Date() + 1,
ingredient = c("tofu", "brocoli"),
to_reheat = TRUE,
box = "blue"
)
fridge$insert(monday_meal)
```
```{r}
class(monday_meal)
fridge_find <- fridge$find()
class(fridge_find)
fridge_iterate <- fridge$iterate()$one()
class(fridge_iterate)
```
It's even more intersting if you look at the detail structure of each element:
```{r}
dplyr::glimpse(monday_meal)
dplyr::glimpse(fridge_find)
dplyr::glimpse(fridge_iterate)
```
Let's insert another list:
```{r}
tuesday_meal <- list(
day = Sys.Date() + 2,
ingredient = "pasta",
box = "red"
)
fridge$insert(tuesday_meal)
```
Depending on what you query, you won't get the same structure:
```{r eval = TRUE, echo = TRUE}
dplyr::glimpse(
fridge$iterate('{"box": "blue"}')$one()
)
```
```{r eval = TRUE, echo = TRUE}
dplyr::glimpse(
fridge$iterate('{"box": "red"}')$one()
)
```
As MongoDB & `{mongolite}` doesn't inforce any kind of type on read/write, it can be hard to rely on the output.
`{mongooser}` tries to solve that issue by porting [mongoosejs](https://mongoosejs.com/), an object modeling tooling for MongoDB and NodeJS, to R.
## Here is an example
```
docker run -d -p 2811:27017 mongo:3.4
```
The comments are the correponding mongoosejs codes
```{r eval = TRUE}
# const mongoose = require('mongoose');
library(mongooser)
# mongoose.connect("mongodb://127.0.0.1:27017/test")
mongooser_connect(
db = "test",
url = "mongodb://localhost",
verbose = FALSE,
options = mongolite::ssl_options()
)
```
We'll then build a model, which is a schema for our data:
```{r}
# const Cat = mongoose.model('Cat', { name: String });
Food <- model(
"Food",
properties = list(
day = \(x) structure(NA_real_, class = "Date"),
ingredient = character,
box = character,
to_reheat = logical
),
validator = list(
day = lubridate::ymd,
ingredient = as.character,
box = as.character,
to_reheat = as.logical
)
)
```
Let's empty our collection first:
```{r eval = TRUE}
Food$drop()
```
Here, properties is a list of function that returns a data type.
It should be one of base R data type, or a function that returns a datatype.
If you want to build your own properties, it should be a function that takes one argument (the lenght) and return a repetition of that data type.
You can then create an instance of that model and save it:
```{r eval = TRUE}
monday <- Food$new(
monday_meal
)
monday$save()
tuesday <- Food$new(
tuesday_meal
)
tuesday$save()
```
Then, you can query the model:
```{r eval = TRUE, echo = TRUE}
dplyr::glimpse(
Food$find()
)
```
List the number of items :
```{r eval = TRUE, echo = TRUE}
Food$count_documents()
```
Query one item with `find_one()`.
Note that this will return the first element matching the query, that does not mean that there are no other records matching this query:
```{r}
Food$find_one()
```