{"id":15118977,"url":"https://github.com/Bayer-Group/BayerCLAW","last_synced_at":"2025-09-28T01:31:32.169Z","repository":{"id":41401280,"uuid":"353039568","full_name":"Bayer-Group/BayerCLAW","owner":"Bayer-Group","description":"BayerCLAW workflow orchestration system for AWS","archived":false,"fork":false,"pushed_at":"2025-08-20T14:49:22.000Z","size":1670,"stargazers_count":22,"open_issues_count":1,"forks_count":9,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-09-09T14:02:05.918Z","etag":null,"topics":["aws","bayer-not-classified","bayer-reg-none","beat-not-applicable","pipeline","workflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Bayer-Group.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":"MAINTAINERS","copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-03-30T14:55:51.000Z","updated_at":"2025-08-20T14:48:08.000Z","dependencies_parsed_at":"2024-02-13T16:29:20.115Z","dependency_job_id":"61add9ae-4eaf-4f80-a9c6-0333d2694ab2","html_url":"https://github.com/Bayer-Group/BayerCLAW","commit_stats":null,"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/Bayer-Group/BayerCLAW","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bayer-Group%2FBayerCLAW","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bayer-Group%2FBayerCLAW/tags","releases_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/Bayer-Group%2FBayerCLAW/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bayer-Group%2FBayerCLAW/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Bayer-Group","download_url":"https://codeload.github.com/Bayer-Group/BayerCLAW/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bayer-Group%2FBayerCLAW/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":277315117,"owners_count":25797567,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-27T02:00:08.978Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","bayer-not-classified","bayer-reg-none","beat-not-applicable","pipeline","workflow"],"created_at":"2024-09-26T01:53:41.320Z","updated_at":"2025-09-28T01:31:27.161Z","avatar_url":"https://github.com/Bayer-Group.png","language":"Python","funding_links":[],"categories":["Ranked by starred repositories"],"sub_categories":[],"readme":"# Bayer CLoud Automated Workflows (BayerCLAW)\n\nBayerCLAW is a workflow orchestration system targeted at bioinformatics pipelines.\nA workflow consists of a sequence of computational steps, each of which is captured in a Docker container.\nSome steps may parallelize work across many executions of the same container (scatter/gather pattern).\n\nA workflow is described 
in a YAML file.\nThe BayerCLAW compiler uses AWS CloudFormation to transform the workflow description into AWS resources used by the workflow.\nThis includes an AWS Step Functions state machine that represents the sequence of steps in the workflow.\n\nA workflow typically takes several parameters, such as sample IDs or paths to input files.\nOnce the workflow definition has been deployed, the workflow can be executed by copying a JSON file with the\nexecution parameters to a \"launcher\" S3 bucket, which is constructed by BayerCLAW.\nThe workflow state machine uses AWS Batch to actually run the Docker containers, in the proper order.\n\n## Documentation\n\n- [Quick start -- deploying a BayerCLAW workflow](doc/quick-start.md)\n- [Tutorial -- detailed example of writing, deploying, and debugging](doc/tutorial.md)\n\n- [Installing BayerCLAW into a new AWS account](doc/installation.md)\n- [The BayerCLAW language reference](doc/language.md)\n- [The BayerCLAW language -- scatter/gather](doc/scatter.md)\n- [The BayerCLAW language -- QC checks](doc/qc.md)\n- [The BayerCLAW language -- subpipes](doc/subpipes.md)\n- [Runtime environment and Docker guidelines](doc/runtime_env.md) for steps\n- [BayerCLAW notifications](doc/notifications.md)\n\nThe [doc/](doc/) directory of this repo contains all the pages linked above.\n\n## Key components of BayerCLAW\n\n### The workflow definition\n\nThe BayerCLAW workflow template is a JSON- or YAML-formatted file describing the processing steps of the pipeline.\nHere is an example of a very simple, one-step workflow:\n\n```YAML\nTransform: BC2_Compiler\n\nRepository: s3://example-bucket/hello-world/${job.SAMPLE_ID}\n\nSteps:\n  - hello:\n      image: docker.io/library/ubuntu\n      commands:\n        - echo \"Hello world! 
This is job ${job.SAMPLE_ID}!\"\n```\n\n### The repository\n\nThe repository is a path within an S3 bucket where a given workflow stores its output files, such as `s3://generic-workflow-bucket/my-workflow-repo/`.\nThe repo is typically parameterized with some job-specific unique ID, so that each execution of the workflow is kept separate.\nFor example, `s3://generic-workflow-bucket/my-workflow-repo/job12345/`\n\n### Job data file\nThe job data file contains data needed for a single pipeline execution.\nThis data must be encoded as a flat JSON object with string keys and string values.\nEven integer or float values should be quoted as strings.\n\nCopying the job data file into the launcher bucket will trigger an execution of the pipeline.\nOverwriting the job data file, even with the same contents, will trigger another execution.\n\n#### Sample job data file\n```json5\n{\n  \"SAMPLE_ID\": \"ABC123\",\n  \"READS1\": \"s3://workflow-bucket/inputs/reads1.fq\",\n  \"READS2\": \"s3://workflow-bucket/inputs/reads2.fq\"\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBayer-Group%2FBayerCLAW","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBayer-Group%2FBayerCLAW","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBayer-Group%2FBayerCLAW/lists"}