https://github.com/veupathdb/study-wrangler
Code to keep study data tidy
- Host: GitHub
- URL: https://github.com/veupathdb/study-wrangler
- Owner: VEuPathDB
- License: apache-2.0
- Created: 2022-09-30T18:24:51.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-08-04T12:29:21.000Z (11 months ago)
- Last Synced: 2025-08-04T15:47:28.132Z (11 months ago)
- Language: R
- Size: 526 KB
- Stars: 0
- Watchers: 19
- Forks: 0
- Open Issues: 25
Metadata Files:
- Readme: README.org
- License: LICENSE
README
#+TITLE: README
The study.wrangler R package tidies study data so that it is compatible with VEuPathDB workflows for EDA loading.
* Install in existing R environment or RStudio
#+BEGIN_SRC R
# Install the remotes package
install.packages("remotes") # if you don't already have 'remotes'
# Install (or upgrade) the study.wrangler package from GitHub
remotes::install_github("VEuPathDB/study-wrangler", build_vignettes = TRUE, upgrade = FALSE)
# And load all its functions into your namespace
library(study.wrangler)
# Browse the vignettes (tutorials) in your web browser
browseVignettes("study.wrangler")
# If your web browser doesn't open with the vignette index page, use one of these:
options(browser = "xdg-open") # Linux
# or
options(browser = "open") # macOS
# or
options(browser = "C:/Program Files/Google/Chrome/Application/chrome.exe") # Windows
# and then try again with
browseVignettes("study.wrangler")
#+END_SRC
** Tip: GitHub rate limiting
If you see errors like =HTTP error 403= or =API rate limit exceeded= when installing packages from GitHub, you may be hitting GitHub's unauthenticated API limit (60 requests/hour).
To avoid this, set a GitHub Personal Access Token (PAT):
1. Generate a token via https://github.com/settings/personal-access-tokens
Only the default read-only permissions to public repositories are needed.
2. If you're using R from the command line (not RStudio), set your token in your shell config:
- zsh (default on most Macs): Add
#+BEGIN_SRC sh
export GITHUB_PAT=github_pat_...
#+END_SRC
to your =~/.zshrc=
- bash: Add it to =~/.bash_profile= or =~/.bashrc=
3. For RStudio, we recommend setting it in your R environment file via:
#+BEGIN_SRC R
usethis::edit_r_environ()
#+END_SRC
Add the line
#+BEGIN_SRC sh
GITHUB_PAT=github_pat_...
#+END_SRC
Save the file and restart your R session.
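To confirm that the token is actually visible before retrying the install, a quick check along these lines may help. This is a minimal sketch, not part of the package: =check_github_pat= is a hypothetical helper name, and the =Rscript= step assumes R is on your =PATH=. The token itself is never printed.

#+BEGIN_SRC sh
# Hypothetical helper: report whether GITHUB_PAT is set, without printing its value.
check_github_pat() {
  if [ -n "${GITHUB_PAT:-}" ]; then
    echo "GITHUB_PAT is set (${#GITHUB_PAT} characters)"
  else
    echo "GITHUB_PAT is not set"
    return 1
  fi
}

# Check the current shell environment (covers the ~/.zshrc or ~/.bashrc route).
check_github_pat || true

# If R is installed, also ask R itself, which reads ~/.Renviron at startup.
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'cat("Visible to R:", nzchar(Sys.getenv("GITHUB_PAT")), "\n")'
fi
#+END_SRC

Open a new terminal (or restart R) first, so that the edited config file has been re-read.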
** Tip: Edit and run the vignette as a notebook
To experiment with the vignette code interactively:
1. Clone the source repo:
#+BEGIN_SRC sh
git clone https://github.com/VEuPathDB/study-wrangler.git
#+END_SRC
2. Open the =.Rmd= file in RStudio (File -> Open):
#+begin_example
study-wrangler/vignettes/cleaning-and-preparing-basics.Rmd
#+end_example
You can now run the code chunks interactively and modify them as you experiment.
* To build a docker image
#+begin_example
docker build -t veupathdb/study-wrangler .
#+end_example
Build notes:
- Add ~--progress=plain~ to the build command to make errors easier to see.
- Each ~remotes::install_github()~ call in the Dockerfile counts against GitHub's rate limit for anonymous users (60 requests per hour), so repeated rebuilds while experimenting will almost certainly hit the limit.
- Setting up credentials for a higher limit does not seem worth the effort.
To run in docker:
#+begin_example
docker run --rm -ti -e PASSWORD=password -p 8989:8787 veupathdb/study-wrangler
# Then in your web browser navigate to localhost:8989 and login with "rstudio" and "password"
#+end_example
To add R functions to the repo (using docker):
#+begin_example
docker run --rm -ti --name study-wrangler-dev -v $PWD:/study.wrangler -e PASSWORD=password -p 8888:8787 veupathdb/study-wrangler
# Then in your web browser navigate to localhost:8888 and login with "rstudio" and "password"
# at the RStudio prompt
> library(devtools)
# then in the file browser:
# 1. navigate into ./study.wrangler directory
# 2. Set As Working Directory (gear icon in file browser)
# 3. (or do `setwd("~/study.wrangler")` in console)
> devtools::test()
# (plain `test()` also works, because devtools is attached)
# If running commands in console, you will need to do
> load_all()
# after making changes in the code.
# To update the documentation and/or NAMESPACE file
> document()
> build(path='dist')
> install()
# Note that we are not committing the man/*.Rd files to the repo at the moment
#+end_example
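For a quick check without opening RStudio in the browser, the same test run can probably be started from the host shell. The following is only a sketch under assumptions: =run_wrangler_tests= and the =DRY_RUN= variable are hypothetical names introduced here, the image is assumed to be tagged =veupathdb/study-wrangler= as in the build step, and it is assumed that the image has =Rscript= on its =PATH= and no entrypoint that would swallow the command.

#+BEGIN_SRC sh
# Hypothetical helper: run devtools::test() in a throwaway container,
# mounting the current checkout at /study.wrangler as in the interactive setup.
# Assumption: the image accepts an arbitrary command and provides Rscript.
run_wrangler_tests() {
  image="${1:-veupathdb/study-wrangler}"
  # With DRY_RUN set to a non-empty value, the command is printed, not executed.
  ${DRY_RUN:+echo} docker run --rm \
    -v "$PWD":/study.wrangler \
    -w /study.wrangler \
    "$image" \
    Rscript -e 'devtools::test()'
}

# Show the command that would be run, without needing Docker.
DRY_RUN=1 run_wrangler_tests
#+END_SRC

Run it from the root of the cloned repository. Calling =run_wrangler_tests= without =DRY_RUN= executes the command for real.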