Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hrbrmstr/sergeant-caffeinated
:guardsman: ☕️ Tools to Transform and Query Data with 'Apache' 'Drill'
dplyr drill jdbc parquet-files r rstats sql
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/hrbrmstr/sergeant-caffeinated
- Owner: hrbrmstr
- License: other
- Created: 2018-10-14T12:28:44.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-11-21T01:32:41.000Z (almost 4 years ago)
- Last Synced: 2024-08-06T03:04:31.296Z (3 months ago)
- Topics: dplyr, drill, jdbc, parquet-files, r, rstats, sql
- Language: R
- Homepage:
- Size: 179 KB
- Stars: 7
- Watchers: 4
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
README
---
output: rmarkdown::github_document
editor_options:
chunk_output_type: console
---
```{r pkg-knitr-opts, include=FALSE}
hrbrpkghelpr::global_opts()
```

```{r badges, results='asis', echo=FALSE, cache=FALSE}
hrbrpkghelpr::stinking_badges()
```

```{r description, results='asis', echo=FALSE, cache=FALSE}
hrbrpkghelpr::yank_title_and_description()
```

## What's Inside The Tin

The following functions are implemented:
```{r ingredients, results='asis', echo=FALSE, cache=FALSE}
hrbrpkghelpr::describe_ingredients()
```

## Installation

```{r install-ex, results='asis', eval=TRUE, echo=TRUE, cache=FALSE}
remotes::install_github("hrbrmstr/sergeant-caffeinated")
```

## Usage

```{r lib-ex}
library(sergeant.caffeinated)

# current version
packageVersion("sergeant.caffeinated")
```

```{r dplyr-01, message=FALSE}
library(tidyverse)

# use localhost if running standalone on same system otherwise the host or IP of your Drill server
test_host <- Sys.getenv("DRILL_TEST_HOST", "localhost")

be_quiet()

con <- dbConnect(drv = DrillJDBC(), sprintf("jdbc:drill:zk=%s", test_host))

db <- tbl(con, "cp.`employee.json`")

# without `collect()`:
db %>%
  count(
    gender,
    marital_status
  )

db %>%
  count(
    gender,
    marital_status
  ) %>%
  collect()

db %>%
  group_by(position_title) %>%
  count(gender) -> tmp2

group_by(db, position_title) %>%
  count(gender) %>%
  ungroup() %>%
  mutate(
    full_desc = ifelse(gender == "F", "Female", "Male")
  ) %>%
  collect() %>%
  select(
    Title = position_title,
    Gender = full_desc,
    Count = n
  )

arrange(db, desc(employee_id)) %>% print(n = 20)

db %>%
  mutate(
    position_title = tolower(position_title),
    salary = as.numeric(salary),
    gender = ifelse(gender == "F", "Female", "Male"),
    marital_status = ifelse(marital_status == "S", "Single", "Married")
  ) %>%
  group_by(supervisor_id) %>%
  summarise(
    underlings_count = n()
  ) %>%
  collect()
```

## sergeant Metrics

```{r echo=FALSE}
cloc::cloc_pkg_md()
```

## Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.