{"id":32605749,"url":"https://github.com/rworkflow/rcwlworkshop","last_synced_at":"2026-02-25T01:36:23.755Z","repository":{"id":184045681,"uuid":"671220370","full_name":"rworkflow/RcwlWorkshop","owner":"rworkflow","description":null,"archived":false,"fork":false,"pushed_at":"2023-08-01T12:11:24.000Z","size":3476,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"devel","last_synced_at":"2024-05-09T07:53:09.001Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://rcwl.org/RcwlWorkshop/","language":"Dockerfile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rworkflow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-07-26T20:24:27.000Z","updated_at":"2023-07-26T20:35:51.000Z","dependencies_parsed_at":"2023-07-26T21:52:53.701Z","dependency_job_id":null,"html_url":"https://github.com/rworkflow/RcwlWorkshop","commit_stats":null,"previous_names":["rworkflow/rcwlworkshop"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rworkflow/RcwlWorkshop","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rworkflow%2FRcwlWorkshop","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rworkflow%2FRcwlWorkshop/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rworkflow%2FRcwlWorkshop/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rworkflow%2FRcwlWorkshop/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rworkflow","download_url":"https://codeload.github.com/rworkflow/RcwlWorkshop/tar.gz/ref
s/heads/devel","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rworkflow%2FRcwlWorkshop/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281793987,"owners_count":26562617,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-30T02:00:06.501Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-30T10:59:21.751Z","updated_at":"2025-10-30T10:59:31.307Z","avatar_url":"https://github.com/rworkflow.png","language":"Dockerfile","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Use R to Create and Execute Reproducible CWL Workflows for Genomic Research\n\nAuthors:\n\tQian Liu ^[Roswell Park Comprehensive Cancer Center],\n    Another Author^[Roswell Park Comprehensive Cancer Center].\n    \u003cbr/\u003e\nLast modified: July 27, 2023.\n\n### Pre-requisites\n\n- Basic familiarity with variant calling from DNA-seq data \n- Interest in using a workflow language \n\n### Workshop Participation\n\nThe workshop format is a 45-minute session consisting of hands-on demos, exercises and Q\u0026A.\n\n### R / Bioconductor packages used\n- ReUseData\n- RcwlPipelines\n- Rcwl\n\n## Workshop: Somatic variant calling\n\nFor somatic variant calling, we will need to prepare the following: \n\n- Experiment data \n  - As `.bam` and `.bam.bai` files\n- Reusable genomic data \n  - 
reference sequence file (`b37` or `hg38`)\n  - Panel of Normals (PON) [ref](https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON-)\n- Software tool: \n  - Here we use `Mutect2` to call somatic SNVs and indels via local assembly of\n    haplotypes. [ref](https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2)\n\nWe also want the data analysis workflow to be reproducible:  \n\n1. Software tools properly tracked for version, Docker image, etc.\n2. Data provenance properly tracked for public data resources, for: \n\t- workflow reproducibility\n\t- later reuse in other similar projects\n\nThe first can be solved by workflow languages (e.g., CWL, WDL,\nSnakemake). There are no similar tools for the second task. \n\nIn this workshop, I will demonstrate two _Bioconductor_ packages:\n`Rcwl` as an R interface for `CWL`, and `RcwlPipelines` for \u003e200\npre-built bioinformatics tools and best practice pipelines in _R_,\nthat are easily usable and highly customizable. I will also introduce\nan _R/Bioconductor_ package `ReUseData` for the management of reusable\ngenomic data.\n\nWith these tools, we should be able to conduct reproducible data\nanalysis using commonly used bioinformatics tools (including\ncommand-line based tools and _R/Bioconductor_ packages) and validated,\nbest practice workflows (based on workflow languages such as CWL)\nwithin a unified _R_ programming environment.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frworkflow%2Frcwlworkshop","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frworkflow%2Frcwlworkshop","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frworkflow%2Frcwlworkshop/lists"}