Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/public-health-scotland/slfhelper
An R package for working with the SLFs
- Host: GitHub
- URL: https://github.com/public-health-scotland/slfhelper
- Owner: Public-Health-Scotland
- License: other
- Created: 2019-05-30T15:29:20.000Z (over 5 years ago)
- Default Branch: production
- Last Pushed: 2024-09-17T13:58:10.000Z (about 2 months ago)
- Last Synced: 2024-09-17T16:54:53.422Z (about 2 months ago)
- Topics: r, r-package
- Language: R
- Homepage: https://public-health-scotland.github.io/slfhelper/
- Size: 6.13 MB
- Stars: 6
- Watchers: 2
- Forks: 1
- Open Issues: 10
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Support: .github/SUPPORT.md
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
eval = FALSE
)
```

[![GitHub release (latest by date)](https://img.shields.io/github/v/release/Public-Health-Scotland/slfhelper)](https://github.com/Public-Health-Scotland/slfhelper/releases/latest)
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![R-CMD-check](https://github.com/Public-Health-Scotland/slfhelper/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/Public-Health-Scotland/slfhelper/actions/workflows/R-CMD-check.yaml)
[![codecov](https://codecov.io/gh/Public-Health-Scotland/slfhelper/branch/production/graph/badge.svg?token=ev2n04MPNG)](https://codecov.io/gh/Public-Health-Scotland/slfhelper)

# slfhelper
The goal of slfhelper is to provide some easy-to-use functions that make working with the Source Linkage Files as painless and efficient as possible. It is only intended for use by PHS employees and will only work on the PHS R infrastructure.
## Installation
The simplest way to install `slfhelper` in the PHS Posit Workbench environment is from the [PHS Package Manager](https://ppm.publichealthscotland.org/client/#/repos/3/packages/slfhelper). This is the default setting there, so you can install `slfhelper` as you would any other package.
``` {r package_install_ppm}
install.packages("slfhelper")
```

If this doesn't work, you can install it directly from GitHub. There are a number of ways to do this; we recommend the [{`pak`} package](https://pak.r-lib.org/).
```{r package_install_github}
# Install pak (if needed)
if (!requireNamespace("pak", quietly = TRUE)) {
  install.packages("pak")
}

# Use pak to install slfhelper
pak::pak("Public-Health-Scotland/slfhelper")
```

## Usage
### Read a file
**Note:** Reading a full file is quite slow and uses a lot of memory. We always recommend selecting only the columns you need for your analysis; this alone will dramatically speed up the read time.
We provide some data snippets to help with column selection and filtering.
```{r helper_data}
library(slfhelper)

# Get a list of the variables in a file
ep_file_vars
indiv_file_vars

# See a lookup of Partnership names to HSCP_2018 codes
View(partnerships)

# See a list with descriptions for the recids
View(recids)
```

```{r read_files}
library(slfhelper)

# Read certain variables
# It's much faster to choose variables like this
indiv_1718 <- read_slf_individual(year = "1718", col_select = c("anon_chi", "hri_scot"))

# Read multiple years
# This will use dplyr::bind_rows() and return the files added together as a single tibble
episode_data <- read_slf_episode(
  year = c("1516", "1617", "1718", "1819"),
  col_select = c("anon_chi", "yearstay")
)

# Read only data for a certain partnership (HSCP_2018 code)
# This can be a single partnership or multiple by supplying a vector e.g. c(...)
indiv_1718 <- read_slf_individual(
  year = "1718",
  partnerships = "S37000001", # Aberdeen City
  col_select = c("anon_chi", "hri_scot")
)

# Read only data for a certain recid
# This can be a single recid or multiple by supplying a vector e.g. c(...)
ep_1718 <- read_slf_episode("1718", recid = c("01B", "GLS"), col_select = c("anon_chi", "yearstay"))
```
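
As a rough sketch of several of these options used in a single call (this example is illustrative rather than taken from the package documentation, and it assumes `read_slf_episode()` accepts `partnerships` in the same way `read_slf_individual()` does above):

```{r read_combined}
library(slfhelper)

# Illustrative only: two years, one partnership, two recids and a small
# column selection in a single read
# Assumption: read_slf_episode() takes `partnerships` like read_slf_individual()
ep_aberdeen <- read_slf_episode(
  year = c("1718", "1819"),
  partnerships = "S37000001", # Aberdeen City
  recid = c("01B", "GLS"),
  col_select = c("anon_chi", "yearstay")
)
```
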
The above options for reading files can (and should) be combined if required.

### Match on CHI numbers to Anon_CHI (or vice versa)
```{r chi_matching}
library(slfhelper)

# Add real CHI numbers to a SLF
ep_1718 <- read_slf_episode(c("1718", "1819", "1920"),
  col_select = c("year", "anon_chi", "demographic_cohort")
) %>%
  get_chi()

# Change chi numbers from the data above back to anon_chi
ep_1718_anon <- ep_1718 %>%
  get_anon_chi(chi_var = "chi")

# Add anon_chi to the cohort sample
chi_cohort <- chi_cohort %>%
  get_anon_chi(chi_var = "upi_number")
```