{"id":25865303,"url":"https://github.com/jcaperella29/bwa_samtools_cwl_flow","last_synced_at":"2026-03-07T00:01:36.486Z","repository":{"id":262292229,"uuid":"886797919","full_name":"jcaperella29/BWA_SAMTOOLS_CWL_FLOW","owner":"jcaperella29","description":"a common language workflow , workflow to process Fastq files","archived":false,"fork":false,"pushed_at":"2024-11-18T01:49:43.000Z","size":10,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-02T01:37:31.011Z","etag":null,"topics":["bioinformatics","cwl-workflow","next-generation-sequencing"],"latest_commit_sha":null,"homepage":"","language":"Common Workflow Language","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jcaperella29.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-11T16:13:32.000Z","updated_at":"2025-01-08T16:46:46.000Z","dependencies_parsed_at":"2024-11-11T17:35:34.354Z","dependency_job_id":null,"html_url":"https://github.com/jcaperella29/BWA_SAMTOOLS_CWL_FLOW","commit_stats":null,"previous_names":["jcaperella29/bwa_samtools_cwl_flow"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jcaperella29/BWA_SAMTOOLS_CWL_FLOW","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jcaperella29%2FBWA_SAMTOOLS_CWL_FLOW","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jcaperella29%2FBWA_SAMTOOLS_CWL_FLOW/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jcaperella29%2FBWA_SAMTOOLS_CWL_FLOW/release
s","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jcaperella29%2FBWA_SAMTOOLS_CWL_FLOW/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jcaperella29","download_url":"https://codeload.github.com/jcaperella29/BWA_SAMTOOLS_CWL_FLOW/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jcaperella29%2FBWA_SAMTOOLS_CWL_FLOW/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30204109,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-06T19:07:06.838Z","status":"ssl_error","status_checked_at":"2026-03-06T18:57:34.882Z","response_time":250,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","cwl-workflow","next-generation-sequencing"],"created_at":"2025-03-02T01:34:23.173Z","updated_at":"2026-03-07T00:01:36.466Z","avatar_url":"https://github.com/jcaperella29.png","language":"Common Workflow Language","funding_links":[],"categories":[],"sub_categories":[],"readme":"This repository contains a Common Workflow Language (CWL) workflow that:\n\nAligns paired-end reads to a reference genome using BWA.\nSorts the alignment output with Samtools.\nConverts the sorted BAM file to BED format using Bedtools.\nThis setup is ideal for bioinformatics pipelines requiring reproducibility, portability, 
and flexibility across sequencing data types.\n\n📋 Project Structure\nproject-root/\n├── bwa_samtools_bedtools_workflow.cwl # Main CWL workflow for BWA, Samtools, and Bedtools steps\n├── bwa_mem_tool.cwl                   # BWA alignment tool definition\n├── samtools_sort_tool.cwl             # Samtools sorting tool definition\n├── bedtools_bamtobed_tool.cwl         # Bedtools BAM to BED conversion tool definition\n├── example-data/                      # (Optional) Example FASTA/FASTQ files for testing\n└── docs/                              # Additional documentation or notes\n🚀 Getting Started\nPrerequisites\nCWLTool: Install cwltool to run CWL workflows.\npip install cwltool\nBWA: Ensure BWA is installed and on your PATH (see the BWA installation guide).\nSamtools: Ensure Samtools is installed and on your PATH (see the Samtools installation guide).\nBedtools: Ensure Bedtools is installed and on your PATH (see the Bedtools installation guide).\nSetup\nClone the repository:\n\ngit clone https://github.com/jcaperella29/BWA_SAMTOOLS_CWL_FLOW.git\ncd BWA_SAMTOOLS_CWL_FLOW\nPrepare Reference Genome:\n\nUse BWA to index the reference genome:\n\nbwa index path/to/chr22.fa\nThis generates the index files (.bwt, .pac, .ann, .amb, .sa) in the same directory as chr22.fa.\n\nOrganize Input Data:\n\nPlace the reference genome and the paired-end FASTQ files where cwltool can read them, and note their full paths.\n🧬 Running the Workflow\nUse the following command to execute the workflow:\n\ncwltool bwa_samtools_bedtools_workflow.cwl \\\n  --ref_genome /full/path/to/chr22.fa \\\n  --ref_genome_bwt /full/path/to/chr22.fa.bwt \\\n  --ref_genome_pac /full/path/to/chr22.fa.pac \\\n  --ref_genome_ann /full/path/to/chr22.fa.ann \\\n  --ref_genome_amb /full/path/to/chr22.fa.amb \\\n  --ref_genome_sa /full/path/to/chr22.fa.sa \\\n  --read1 /full/path/to/sample_R1.fastq \\\n  --read2 /full/path/to/sample_R2.fastq\nExample Command\ncwltool 
--no-container /path/to/bwa_samtools_bedtools_workflow.cwl \\\n  --ref_genome /path/to/chr22.fa \\\n  --ref_genome_bwt /path/to/chr22.fa.bwt \\\n  --ref_genome_pac /path/to/chr22.fa.pac \\\n  --ref_genome_ann /path/to/chr22.fa.ann \\\n  --ref_genome_amb /path/to/chr22.fa.amb \\\n  --ref_genome_sa /path/to/chr22.fa.sa \\\n  --read1 /path/to/sample_R1.fastq \\\n  --read2 /path/to/sample_R2.fastq\nOutput\nThe workflow produces:\n\nA sorted BAM file (sorted_bam_output) created by Samtools, which is the aligned and sorted sequence data.\nA BED file (bed_output) created by Bedtools, which converts the sorted BAM alignment to BED format.\n📄 License\nThis project is licensed under the MIT License. See the LICENSE file for more details.\n\n🛠️ Additional Notes\nEnsure all index files are in the same directory as the reference genome.\nThe workflow is compatible with cloud platforms and HPC environments supporting CWL.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjcaperella29%2Fbwa_samtools_cwl_flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjcaperella29%2Fbwa_samtools_cwl_flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjcaperella29%2Fbwa_samtools_cwl_flow/lists"}