Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jhorzek/tablespan
Create satisficing tables in R.
https://github.com/jhorzek/tablespan
excel html latex r rtf tables
Last synced: 24 days ago
JSON representation
Create satisficing tables in R.
- Host: GitHub
- URL: https://github.com/jhorzek/tablespan
- Owner: jhorzek
- License: gpl-3.0
- Created: 2024-11-04T22:17:23.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-30T17:30:52.000Z (27 days ago)
- Last Synced: 2024-11-30T18:29:22.502Z (27 days ago)
- Topics: excel, html, latex, r, rtf, tables
- Language: R
- Homepage: https://jhorzek.github.io/tablespan/
- Size: 3.22 MB
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 3
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md
Awesome Lists containing this project
- jimsghstars - jhorzek/tablespan - Create satisficing tables in R. (R)
README
---
output: github_document
---```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(dplyr)
library(tablespan)
```# tablespan
> Create satisficing tables in R the formula way.
The objective of `tablespan` is to provide a "good enough" approach to creating tables by
leveraging R's formulas.`tablespan` builds on the awesome packages [`openxlsx`](https://ycphs.github.io/openxlsx/) and [`gt`](https://gt.rstudio.com/), which allows tables created with `tablespan`
to be exported to the following formats:1. **Excel** (using [`openxlsx`](https://ycphs.github.io/openxlsx/))
2. **HTML** (using [`gt`](https://gt.rstudio.com/))
3. **LaTeX** (using [`gt`](https://gt.rstudio.com/))
4. **RTF** (using [`gt`](https://gt.rstudio.com/))## Installation
`tablespan` is not yet available from CRAN. To install from GitHub, run the following
lines:```{r, eval=FALSE}
library(remotes)
remotes::install_github("jhorzek/tablespan")
```## Introduction
R has a large set of great packages that allow you to create and export tables
that look exactly like you envisioned. However, sometimes you may just need a
good-enough table that is easy to create and share with others. This is where
`tablespan` can be of help.Let's assume that we want to share the following table:
```{r}
library(dplyr)
data("mtcars")summarized_table <- mtcars |>
group_by(cyl, vs) |>
summarise(N = n(),
mean_hp = mean(hp),
sd_hp = sd(hp),
mean_wt = mean(wt),
sd_wt = sd(wt))print(summarized_table)
```We don't want to share the table as is - the variable names are all a bit technical
and the table could need some spanners summarizing columns. So, we want
to share a table that looks something like this:```
| | Horse Power | Weight |
| Cylinder | Engine | Mean | SD | Mean | SD |
| -------- | ------ | ----- | --- | ---- | -- |
| | |
````tablespan` allows us to create this table with a single formula.
### Creating a Basic Table
In `tablespan`, the table headers are defined with a formula. For example,
`cyl ~ mean_hp + sd_hp` defines a table with `cyl` as the
row names and `mean_hp` and `sd_hp` as columns:```{r}
library(tablespan)
tablespan(data = summarized_table,
formula = cyl ~ mean_hp + sd_hp)
```Note that the row names (`cyl`) are in a separate block to the left.
### Adding Spanners
Spanners are defined using braces and spanner names. For example, the
following defines a spanner for `mean_hp` and `sd_hp` with the name `Horsepower`:
`cyl ~ (Horsepower = mean_hp + sd_hp)`:```{r}
tablespan(data = summarized_table,
formula = cyl ~ (Horsepower = mean_hp + sd_hp))
```Spanners can also be nested:
```{r}
tablespan(data = summarized_table,
formula = cyl ~ (Horsepower = (Mean = mean_hp) + (SD = sd_hp)))
```### Renaming Columns
Variable names in an R `data.frame` are often very technical (e.g., `mean_hp` and `sd_hp`).
When sharing the table, we may want to replace those names. In the example above,
we may want to replace `mean_hp` and `sd_hp` with "Mean" and "SD". In
`tablespan` renaming variables is achieved with `new_name:old_name`.
For example, `cyl ~ (Horsepower = Mean:mean_hp + SD:sd_hp)` renames `mean_hp` to
`Mean` and `sd_hp` to `SD`:```{r}
tablespan(data = summarized_table,
formula = cyl ~ (Horsepower = Mean:mean_hp + SD:sd_hp))
```### Creating the Full Table
The combination of row names, spanners, and renaming of variables allows creating
the full table:```{r}
library(dplyr)
library(tablespan)
data("mtcars")summarized_table <- mtcars |>
group_by(cyl, vs) |>
summarise(N = n(),
mean_hp = mean(hp),
sd_hp = sd(hp),
mean_wt = mean(wt),
sd_wt = sd(wt))tbl <- tablespan(data = summarized_table,
formula = Cylinder:cyl + Engine:vs ~
N +
(`Horse Power` = Mean:mean_hp + SD:sd_hp) +
(`Weight` = Mean:mean_wt + SD:sd_wt),
title = "Motor Trend Car Road Tests",
subtitle = "A table created with tablespan",
footnote = "Data from the infamous mtcars data set.")
tbl
```## Exporting to Excel
Tables created with `tablespan` can now be translated to xlsx tables with [`openxlsx`](https://ycphs.github.io/openxlsx/) using the `as_excel` function:
```{r}
# as_excel creates an openxlsx workbook
wb <- as_excel(tbl = tbl)# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```![](man/figures/tablespan_example_cars.png)
### Styling
While `tablespan` provides limited styling options, some elements can be adjusted.
For example, we may want to print some elements in bold or format numbers differently.
In `tablespan`, styling happens when translating the table to an `openxlsx` workbook
with `as_excel`.
To this end, `tablespan` provides a `styles` argument.#### Formatting Cells
Let's assume we want all `mean_hp` values with a value $\geq 100$ to be printed
in bold. To this end, we first create a new style object using `openxlsx`:```{r}
bold <- openxlsx::createStyle(textDecoration = "bold")
```Next, we create a cell style with `tablespan`:
```{r}
hp_ge_100 <- cell_style(rows = which(summarized_table$mean_hp >= 100),
colnames = "mean_hp",
style = bold,
gridExpand = FALSE)
```
Note that we specify the indices of the rows that we want to be in bold and the
column name of the item.Finally, we pass this style as part of a list to `as_excel`:
```{r}
# as_excel creates an openxlsx workbook
wb <- as_excel(tbl = tbl,
styles = tbl_styles(cell_styles = list(hp_ge_100)))# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```![](man/figures/tablespan_example_cars_styled.png)
#### Formatting Data Types
`tablespan` also allows formatting specific data types. Let's assume that we want
to round all doubles to 3 instead of the default 2 digits.
To this end, we use the `create_data_styles` function, where we specify (1) a
function that checks for the data type we want to style (here `is.double`) and
(2) a style for all columns that match that style:
```{r}
double_style <- create_data_styles(double = list(test = is.double,
style = openxlsx::createStyle(numFmt = "0.000")))
wb <- as_excel(tbl = tbl, styles = tbl_styles(data_styles = double_style))# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```![](man/figures/tablespan_example_cars_styled_data.png)
## Exporting to HTML, LaTeX, and RTF
Tables created with `tablespan` can also be exported to `gt` which allows saving as HTML, LaTeX, or RTF file. To this end, we simply have to call `as_gt` on our table:
```{r, include=FALSE}
# Translate to gt:
gt_tbl <- as_gt(tbl = tbl)
```
```{r, eval=FALSE}
# Translate to gt:
gt_tbl <- as_gt(tbl = tbl)
gt_tbl
```
### Styling Great Tables
The `gt` package provides a wide range of functions to adapt the style of the
table created with `as_gt`. For instance, `opt_stylize` adds a pre-defined style
to the entire table:```{r, eval=FALSE}
gt_tbl |>
gt::opt_stylize(style = 6,
color = 'gray')
```
When adapting the `gt` object, there is an important detail to keep in mind: To
ensure that each table spanner has a unique ID, `tablespan` will create IDs that
differ from the text shown in the spanner. To demonstrate this, Let's assume
that we want to add a spanner above `Horse Power` and `Weight`:
```{r, error=TRUE}
gt_tbl |>
gt::tab_spanner(label = "New Spanner",
spanners = c("Horse Power", "Weight"))
```This will throw an error because the spanner IDs are different from the spanner labels.
To get the spanner IDs, use `gt::tab_info()`:```{r, eval = FALSE}
gt_tbl |>
gt::tab_info()
```
The IDs for the spanners can be found at the very bottom. To add another spanner
above `Horse Power` and `Weight`, we have to use these IDs:
```{r, eval=FALSE}
gt_tbl |>
gt::tab_spanner(label = "New Spanner",
spanners = c("__BASE_LEVEL__Horse Power",
"__BASE_LEVEL__Weight"))
```
## Tables without row names
Using `1` on the left hand side of the formula creates a table without row names.
For example, `1 ~ (Horsepower = Mean:mean_hp + SD:sd_hp)` defines```{r}
tablespan(data = summarized_table,
formula = 1 ~ (Horsepower = Mean:mean_hp + SD:sd_hp))
```## References
- gt: Iannone R, Cheng J, Schloerke B, Hughes E, Lauer A, Seo J, Brevoort K, Roy O (2024). gt: Easily Create Presentation-Ready Display Tables. R package version 0.11.1.9000, , .
- expss: Gregory D et al. (2024). expss: Tables with Labels in R. R package version 0.9.31, .
- tables: Murdoch D (2024). tables: Formula-Driven Table Generation. R package version 0.9.31, .
- openxlsx: Schauberger P, Walker A (2023). _openxlsx: Read, Write and Edit xlsx Files_. R package version 4.2.5.2,
.