https://github.com/antoniojbt/episcout
Facilitates cleaning, exploring and visualising large epidemiological datasets.
- Host: GitHub
- URL: https://github.com/antoniojbt/episcout
- Owner: antoniojbt
- License: gpl-3.0
- Created: 2018-09-28T15:10:39.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2025-08-25T18:10:40.000Z (10 months ago)
- Last Synced: 2025-08-25T19:07:50.990Z (10 months ago)
- Topics: epidemiology, r, wrangling
- Language: R
- Homepage:
- Size: 1.16 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: COPYING
[Project status: active](https://www.repostatus.org/#active)
[R CMD check workflow](https://github.com/antoniojbt/episcout/actions/workflows/r-cmd-check.yml)
[Code coverage on Codecov](https://codecov.io/gh/AntonioJBT/episcout)
# episcout
episcout provides helper functions for cleaning, exploring and visualising large datasets. It wraps common preprocessing and descriptive tasks so you can focus on analysis. The package builds on the **tidyverse** and **data.table** ecosystems for fast and flexible data manipulation.
## Features
* **Cleaning** – `epi_clean_*` functions tidy raw data and detect issues such as duplicates or inconsistent labels.
* **Statistics** – `epi_stats_*` functions create summary tables and descriptive statistics in a single call.
* **Plotting** – `epi_plot_*` wrappers make it straightforward to produce common graphs with *ggplot2* and *cowplot*.
* **Utilities** – `epi_utils_*` helpers cover tasks like parallel processing and logging.
## Installation
Install from GitHub:
``` r
install.packages("devtools")
library(devtools)
install_github("AntonioJBT/episcout")
```
## Getting Started
Functions are grouped by purpose, e.g.:
* `epi_clean_*` for data wrangling/cleanup.
* `epi_stats_*` for generating descriptive statistics and contingency tables.
* `epi_plot_*` for plotting (wrappers around ggplot2 and cowplot).
* `epi_utils_*` for utilities such as parallel processing, logging, etc.
* Miscellaneous helpers such as `epi_read`/`epi_write`.
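The naming convention above makes it easy to browse the package by purpose. The following is a minimal sketch in base R, not part of episcout: `group_by_prefix()` and `example_names` are hypothetical names introduced here for illustration, and the example vector contains only function names mentioned in this README.

``` r
# Hypothetical helper (not an episcout function): group function names
# by the naming prefixes described above.
group_by_prefix <- function(fn_names,
                            prefixes = c("epi_clean_", "epi_stats_",
                                         "epi_plot_", "epi_utils_")) {
  # One list element per prefix, holding the names that start with it
  groups <- lapply(prefixes, function(p) fn_names[startsWith(fn_names, p)])
  names(groups) <- prefixes
  # Names matching no prefix (e.g. epi_read, epi_write) go under "other"
  matched <- unlist(groups, use.names = FALSE)
  groups$other <- setdiff(fn_names, matched)
  groups
}

# Function names taken from this README only
example_names <- c("epi_clean_get_dups", "epi_stats_numeric",
                   "epi_head_and_tail", "epi_read", "epi_write")
group_by_prefix(example_names)
```

With the package installed and attached via `library(episcout)`, passing `ls("package:episcout")` instead of `example_names` should group the exported functions in the same way.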
## Example
This is a basic example of things you can do with episcout:
``` r
library(episcout)
# A data frame:
n <- 20
df <- data.frame(var_id = rep(1:(n / 2), each = 2),
                 var_to_rep = rep(c('Pre', 'Post'), n / 2),
                 x = rnorm(n),
                 y = rbinom(n, 1, 0.50),
                 z = rpois(n, 2)
                 )
# Print the first few rows and last few rows:
dim(df)
epi_head_and_tail(df, rows = 2, cols = 2)
epi_head_and_tail(df, rows = 2, cols = 2, last_cols = TRUE)
# Get all duplicates:
check_dups <- epi_clean_get_dups(df, 'var_id', 1)
dim(check_dups)
check_dups
# Get summary descriptive statistics for numeric/integer column:
num_vec <- df$x
desc_stats <- epi_stats_numeric(num_vec)
class(desc_stats)
lapply(desc_stats, class)
desc_stats
# There are many more functions for cleaning, stats and plotting that do things a bit faster or more conveniently than what I could easily find in other packages.
```
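To clarify what "get all duplicates" means in the example above, here is a base R sketch of the general idea. It is an assumption about the intended behaviour, not the implementation of `epi_clean_get_dups`: `all_dups()` and `toy` are hypothetical names used only for illustration. Base `duplicated()` flags only the second and later occurrences of a value, so the sketch combines a forward and a backward pass to keep every row that shares a key.

``` r
# Hypothetical helper (not episcout's code): return every row whose key
# value appears more than once, including the first occurrence.
all_dups <- function(data, key) {
  keys <- data[[key]]
  # duplicated() marks repeats after the first; fromLast = TRUE marks
  # repeats before the last. Together they mark all members of a group.
  is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
  data[is_dup, , drop = FALSE]
}

toy <- data.frame(var_id = c(1, 1, 2, 3, 3, 3),
                  value = letters[1:6])
all_dups(toy, "var_id")  # rows with var_id 1 and 3; var_id 2 is unique
```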
## Contribute
- [Issue Tracker](https://github.com/AntonioJBT/episcout/issues)
- Pull requests welcome!
## Support
If you run into any issues or have questions, please report them in the issue tracker.
## News
- Version 0.1.3: improved coverage tests, added a few wrappers, slightly improved documentation
- Version 0.1.2: minor bug fixes and internal improvements
- Version 0.1.1: first release