Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/clindet/rctl

A set of command line tools based on R and JavaScript.
https://github.com/clindet/rctl

command-line-tool ngs r workflow

Last synced: 3 months ago
JSON representation

A set of command line tools based on R and JavaScript.

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-"
)
```


Life cycle: experimental
Downloads
Version
License

# rctl

Now, there are several difficulties for next-generation sequencing (NGS) data analysis projects that needs to be solved:

- Standardized project management, directory structured,recording and checking of raw data and analysis result, standardized logging for input, output and commands
- Construction and redeployment of computing environment including all required tools, databases and other files.
- Lack of integration and unify of massive data analysis workflows.
- Lack of the unified distribution platform for various data analysis workflows (e.g. snakemake, nextflow, Galaxy, etc.).
- Reuse of workflows language codes (e.g. commands, input and output information) on other programming platform are still complicated.
- The readability and reusable will also be decreased when massive Python and R codes mixed with the workflows language codes.

This project is part of [openanno](https://github.com/openanno), and aim to integrate and develop command line tools based on R and JavaScript ecosystem.


Best practice of reproducible NGS data analysis projects

We proposed that using [node](https://nodejs.org/en/) to distribute the bioinformatics data analysis required workflows (e.g [Common workflow language (CWL) ](https://www.commonwl.org/)) or the command line scripts that created by users. The creation, update and upload of a node package are very simple. Well-tested and high-performance distribution tools of node packages, such as [npm](https://www.npmjs.com/) and [yarn](https://www.yarnpkg.com), are providing the service for more than 1,264,413 packages.

**Command line scripts supported now:**

tool | function
---|---
rdeps | Getting all `rctl` required R packages
rsession | `sessionInfo()` and `sessioninfo::session\_info()`
rinstall | Install R packages and [BioInstaller](https://github.com/rctl/BioInstaller) resources using `install.packages()` and R packages `devtools`, `BiocManager` and `BioInstaller`
rconfig | Parsing and generating json, ini, yaml, and toml format configuration files
rbashful | run bashful
rclrs | Generating colors for visulization using a theme key
rmv | Formating the file names.
ranystr | Generating any counts and any length random strings (e.g. Ies1y7fpgMVjsAyBAtTT)
rtime_stamp | Generating time stamp (e.g. 2018_11_15_22_43_25_, 2018/11/15/, 2018/11/15/22/).
rdownload | Parallel download URLs with logs
rbin | Collecting R packages inst/bin files to a directory, e.g. PATH

### R packages

- optparse
- configr
- stringi
- futile.logger
- glue
- ngstk
- BioInstaller
- devtools
- pacman
- BiocManager
- sessioninfo
- future

## Installation

```{bash eval=FALSE}
# Use conda to manage the env
conda install go nodejs \
echo 'export NODE_PATH="/path/miniconda2/lib/node_modules/"\n' >> ~/.bashrc \
&& npm install -g yarn \
&& echo 'export PATH=$PATH:~/.yarn/bin/\n' >> ~/.bashrc

# Other see https://nodejs.org/en/download/package-manager/
# For: Ubuntu
apt update
apt install -y npm

# For MacOS
brew install node
```

```{bash eval=FALSE}
npm install -g rctl
# or
yarn global add rctl

# If you not to globaly install rctl
# You need to set the PATH
echo "export rctl_ROOT=/path/node_modules/nodejs" >> ~/.bashrc
echo "export PATH=$PATH:${rctl_ROOT}/bin" >> ~/.bashrc

# Current dir is /path
npm install rctl
# or
yarn add rctl
```

## Usage

Before try your `rctl` command line tools, you need run the `rdeps` getting all the extra R packages required by `rctl`.

```{bash eval=FALSE}
# install the extra R packages used in `rctl` scripts
rdeps
```

Then you can use the `rctl` to run all sub-commands.

```{bash}
rctl -h
```

### rbashful

[bashful](https://github.com/wagoodman/bashful) is a GO program and used by `rbashful`, so you need to install it before use the `rbashful`.

**Ubuntu/Debian**

```{bash eval=FALSE}
wget https://github.com/wagoodman/bashful/releases/download/v0.0.10/bashful_0.0.10_linux_amd64.deb
sudo apt install ./bashful_0.0.10_linux_amd64.deb
```

**RHEL/Centos**

```{bash eval=FALSE}
wget https://github.com/wagoodman/bashful/releases/download/v0.0.10/bashful_0.0.10_linux_amd64.rpm
rpm -i bashful_0.0.10_linux_amd64.rpm
```

**Mac**

```{bash eval=FALSE}
brew tap wagoodman/bashful
brew install bashful
```

or download a Darwin build from the releases page.

**Go tools**

```{bash eval=FALSE}
go get github.com/wagoodman/bashful
```

![](https://raw.githubusercontent.com/wagoodman/bashful/master/doc/demo.gif)

View a `rbashful` demo [here](https://github.com/openanno/rctl/test/rbashful/rnaseq_splicing).

```{r}
source_dir <- "/Users/apple/Documents/repositories/rctl/examples/rbashful/rnaseq_splicing/02_leafcutter_majiq"

# View the cli.yaml
cat(paste0(readLines(sprintf("%s/cli.yaml", source_dir)),
collapse = "\n"), sep = "\n")

# View the env.toml
cat(paste0(readLines(sprintf("%s/env.toml", source_dir)),
collapse = "\n"), sep = "\n")

# View the submit.sh
cat(paste0(readLines(sprintf("%s/submit", source_dir)),
collapse = "\n"), sep = "\n")
```

```{bash}
rbashful -h
```

### rsession

```{bash eval=FALSE}
# Print commandline help of rsession
rsession -h

# Print rsession R document (Just like ?sessionInfo in R client)
rsession -d

# Print R sessionInfo()
# The followed three lines are equivalent.
rsession
rsession -f 1
rsession -f sessionInfo

# The followed two lines are equivalent.
rsession -f 2 -e 'include_base=TRUE'
rsession -f sessioninfo::session_info -e 'include_base=TRUE'
```
```{bash}
rsession -f 2 -e 'include_base=TRUE'

rsession -h
```

### rinstall

```{bash eval=FALSE}
# Print commandline help of rinstall
rinstall -h

# Print rinstall R document (Just like ?sessionInfo in R client)
rinstall -d

# Install CRAN R package yaml (install.package)
rinstall yaml
rinstall -f 1 yaml

# Install R package ngstk from GitHub rctl/ngstk (devtools::install_github)
rinstall -f 2 rctl/ngstk

# Install R package ngstk from GitHub rctl/ngstk (install.package)
# devtools::install_github with force = TRUE, ref = 'develop'
rinstall -f 2 -e "force = TRUE, ref = 'develop'" rctl/ngstk

# Install Bioconductor package ggtree (BiocManager)
# BiocManager::install('ggtree')
rinstall -f 3 ggtree

# Install R packages (pacman)
# pacman::p_load(ggtree)
rinstall -f 4 ggtree

# Install and download BioInstaller resources

# Show all BioInstaller default resources name
rinstall -f BioInstaller::install.bioinfo -e "show.all.names=T"
rinstall -f 5 -e "show.all.names=T"

# Show ANNOVAR refgene and avsnp versions
rinstall -f BioInstaller::install.bioinfo -e "show.all.versions=T" db_annovar_refgene
rinstall -f 5 -e "show.all.versions=T" db_annovar_avsnp

# Show ANNOVAR hg19 refgene and avsnp
rinstall -f 5 -e "download.dir='/tmp/refgene', extra.list=list(buildver='hg19')" db_annovar_refgene
rinstall -f 5 -e "download.dir='/tmp/avsnp', extra.list=list(buildver='hg19')" db_annovar_avsnp
```
```{bash}
rinstall -h
```

### rdownload

```{bash eval = FALSE}
rdownload "https://img.shields.io/npm/dm/rctl.svg,https://img.shields.io/npm/v/rctl.svg,https://img.shields.io/npm/l/rctl.svg"

rdownload "https://img.shields.io/npm/dm/rctl.svg,https://img.shields.io/npm/v/rctl.svg,https://img.shields.io/npm/l/rctl.svg" --destfiles "/tmp/rctl1.svg,rctl2.svg,rctl3.svg"

rdownload --urls "https://img.shields.io/npm/dm/rctl.svg , https://img.shields.io/npm/v/rctl.svg,https://img.shields.io/npm/l/rctl.svg" \
--destfiles "rctl1.svg,rctl2.svg,rctl3.svg" --max-cores 1
```
```{bash}
rdownload -h
```

### rconfig

```{r eval = FALSE}
# Use configr::read.config parsing json format file
# Reture the list object output
rconfig package.json
rconfig -c package.json

# Use configr::read.config parsing json format file with the custom R function
rconfig -c test.json -r 'function(x){x[["a"]] + x[["b"]]}'
rconfig -c test.json -r 'function(x){x[["a"]]}'
rconfig -c test.json -r 'function(x){x[["b"]]}'
rconfig -c test.json -r 'x[["b"]]'

# Use configr::write.config parsing json format file
rconfig -f "configr::write.config" test.json -e "config.dat=list(a=1, b=2), write.type='json'"
rconfig -f 2 test.json -e "config.dat=list(a=1, b=2), write.type='json'"
```
```{bash}
rconfig -f "configr::fetch.config" "https://raw.githubusercontent.com/Miachol/configr/master/inst/extdata/config.global.toml"

rconfig -h
```

### rclrs

```{bash eval = TRUE}
# Show default and red/blue theme colors
rclrs default
rclrs -t default
rclrs -t red_blue

# Show default theme colors (extract the first element)
rclrs -t default -r 'x[1]'

# Show all supported theme
rclrs --show-all-themes
```
```{bash}
rclrs -h
```

### rmv

```{bash eval = FALSE}
# do.rename is used to preview the new filenames
#
rmv "`ls`" -e "do.rename = F, prefix = 'prefix', suffix = 'suffix'"
rmv "`ls`" -e "do.rename = F, replace = list(old =c('-', '__'), new = c('_', '_'))"
rmv "`ls`" -e "do.rename = F, toupper = TRUE"
rmv "`ls`" -e "do.rename = F, tolower = TRUE"

rmv "`ls`" -e "do.rename=T, replace=list(old='new', new='old')"
```
```{bash}
rmv -h
```

### rtime_stamp

```{bash}
rtime_stamp

rtime_stamp -r 'x[[1]]'

rtime_stamp -r 'x[[1]][1]'

rtime_stamp -t '%Y_%d'

rtime_stamp -e "extra_flag=c('*')"

rtime_stamp -h
```

### ranystr

```{bash}
./bin/ranystr

./bin/ranystr -l 30

./bin/ranystr -l 20 -n 3
```
```{bash}
ranystr -h
```

```{bash}
# Collect system.files("extdata", "bin", package = "ngstk")
# multiple packages (i.e. ngstk,configr)
rbin ngstk

rbin --destdir /tmp/path ngstk

rbin -h
```

## How to contribute?

Please fork the [GitHub rctl repository](https://github.com/openanno/rctl), modify it, and submit a pull request to us.

## Maintainer

[Jianfeng Li](https://github.com/Miachol)

## License

MIT