https://github.com/vh-d/RETL
R package for ETL
- Host: GitHub
- URL: https://github.com/vh-d/RETL
- Owner: vh-d
- License: other
- Created: 2019-01-26T15:11:23.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-06-27T15:36:26.000Z (over 4 years ago)
- Last Synced: 2024-08-13T07:15:13.406Z (4 months ago)
- Topics: etl, etl-framework, transformations
- Language: R
- Size: 41 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - vh-d/RETL - R package for ETL (R)
README
---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# RETL
`RETL` is an R package that provides tools for writing ETL jobs in R. It builds on R's wide range of interfaces to various types of data sources.
It is intended to be used together with the *[Rflow](https://github.com/vh-d/Rflow)* and *[RETLflow](https://github.com/vh-d/RETLflow)* packages as a universal API to data stored in databases, files, and Excel sheets. RETL relies heavily on the `data.table` package for fast data transformations.
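To make the extract, transform, load pattern concrete, here is a minimal sketch that uses only `DBI`, `RSQLite` and `data.table` directly, without any RETL functions. The in-memory database, the table names and the sample rows are all made up for illustration; this is not how RETL is implemented, only the kind of workflow it is meant to simplify.

```r
library(data.table)

# Assumption: an in-memory SQLite database stands in for a real source and target
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical source table; dates are stored as text to keep the sketch simple
DBI::dbWriteTable(con, "orders", data.frame(
  id = 1:3,
  order_date = c("2019-01-26", "2019-12-31", "2020-06-27"),
  stringsAsFactors = FALSE
))

# EXTRACT: pull the table into a data.table
orders <- as.data.table(DBI::dbReadTable(con, "orders"))

# TRANSFORM: add a column by reference with data.table's := operator
orders[, order_year := year(as.IDate(order_date))]

# LOAD: write the result to a new table (the name is hypothetical)
DBI::dbWriteTable(con, "orders_by_year", orders)

result <- DBI::dbReadTable(con, "orders_by_year")
DBI::dbDisconnect(con)
```

Because `:=` modifies the table in place, the transform step does not copy the data, which is one reason `data.table` suits larger ETL jobs.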
## Installation
RETL can be installed from [GitHub](https://github.com/vh-d/RETL) by running:
``` r
devtools::install_github("vh-d/RETL")
```

## Examples
``` r
library(RETL)
library(magrittr)

# establish connections
my_db <- DBI::dbConnect(RSQLite::SQLite(), "path/to/my.db")
your_csv <- "path/to/your.csv"
your_db <- DBI::dbConnect(RMariaDB::MariaDB(), group = "your-db")
```

### Pipes
``` r
# simple extract and load
etl_read(from = your_csv) %>% etl_write(to = my_db, name = "customers")

# extract -> transform -> load
etl_read(from = my_db, name = "orders") %>%      # db query: EXTRACT from a database
  dtq(, order_year := year(order_date)) %>%      # data.table query: TRANSFORM (adding a new column)
  etl_write(to = your_db, name = "customers")    # LOAD to a db
```

### Other tools
```r
set_index(table = "customers", c("id", "order_year"), your_db)
```
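This README does not describe how `set_index` creates the index. As a rough analogue, the sketch below creates a composite index on the same two columns using plain `DBI` and SQL against an in-memory SQLite database. The index name, the sample table and the use of SQLite are assumptions for illustration only, and the SQL that RETL issues may differ.

```r
# Assumption: an in-memory SQLite database with a small sample table
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "customers", data.frame(
  id = 1:3,
  order_year = c(2019L, 2019L, 2020L)
))

# Composite index on (id, order_year); the index name is made up for this sketch
DBI::dbExecute(
  con,
  "CREATE INDEX idx_customers_id_year ON customers (id, order_year)"
)

# List the indexes SQLite now holds for the table
indexes <- DBI::dbGetQuery(con, "PRAGMA index_list('customers')")
DBI::dbDisconnect(con)
```

An index like this speeds up lookups and joins that filter on `id`, or on `id` and `order_year` together, at the cost of slower writes to the table.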