{"id":13297009,"url":"https://github.com/otiai10/hotsub","last_synced_at":"2025-10-16T22:38:10.838Z","repository":{"id":57554268,"uuid":"113541988","full_name":"otiai10/hotsub","owner":"otiai10","description":"Command line tool to run batch jobs concurrently with ETL framework on AWS or other cloud computing resources","archived":false,"fork":false,"pushed_at":"2018-11-14T02:42:31.000Z","size":290,"stargazers_count":30,"open_issues_count":11,"forks_count":5,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-10-09T18:35:43.387Z","etag":null,"topics":["aws","batch-job","bioinformatics","cwl","cwl-workflow","docker","docker-machine","etl-framework","gcp","wdl","wdl-workflow","workflow","workflow-engine"],"latest_commit_sha":null,"homepage":"https://hotsub.github.io/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/otiai10.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-12-08T06:51:33.000Z","updated_at":"2025-09-30T06:41:34.000Z","dependencies_parsed_at":"2022-08-28T07:40:47.240Z","dependency_job_id":null,"html_url":"https://github.com/otiai10/hotsub","commit_stats":null,"previous_names":["otiai10/awsub"],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/otiai10/hotsub","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otiai10%2Fhotsub","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otiai10%2Fhotsub/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otiai10%
2Fhotsub/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otiai10%2Fhotsub/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/otiai10","download_url":"https://codeload.github.com/otiai10/hotsub/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otiai10%2Fhotsub/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279011854,"owners_count":26085007,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","batch-job","bioinformatics","cwl","cwl-workflow","docker","docker-machine","etl-framework","gcp","wdl","wdl-workflow","workflow","workflow-engine"],"created_at":"2024-07-29T17:21:20.980Z","updated_at":"2025-10-16T22:38:10.794Z","avatar_url":"https://github.com/otiai10.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hotsub [![Build Status](https://travis-ci.org/otiai10/hotsub.svg?branch=master)](https://travis-ci.org/otiai10/hotsub) [![Paper Status](http://joss.theoj.org/papers/f1e4470e4831caa4252427cec8c009a8/status.svg)](http://joss.theoj.org/papers/f1e4470e4831caa4252427cec8c009a8)\n\nThe simple batch job driver on AWS and GCP. 
(Azure and OpenStack support is coming soon)\n\n```sh\nhotsub run \\\n  --script ./star-alignment.sh \\\n  --tasks ./star-alignment-tasks.csv \\\n  --image friend1ws/star-alignment \\\n  --aws-ec2-instance-type t2.2xlarge \\\n  --verbose\n```\n\nIt will\n\n- execute the workflow described in `star-alignment.sh`\n- for each sample specified in `star-alignment-tasks.csv`\n- in `friend1ws/star-alignment` Docker containers\n- on EC2 instances of type `t2.2xlarge`\n\nand automatically upload the output files to S3 and clean up the EC2 instances when everything is done.\n\nSee **[Documentation](https://hotsub.github.io/)** for more details.\n\n# Why use `hotsub`\n\nThere are three reasons why `hotsub` was made and why you might use it:\n\n1. **No need to set up your cloud on web consoles:**\n    - Since `hotsub` uses plain EC2 or GCE instances, you don't have to configure AWS Batch or Dataflow on messy web consoles.\n2. **Multiple platforms with the same command-line interface:**\n    - You can switch between AWS and GCP using only the `--provider` option of the `run` command (of course, you need to have credentials on your local machine).\n3. **ExTL framework available:**\n    - In some bioinformatics use cases, the problem is how to handle a large reference genome shared across tasks. 
`hotsub` suggests and implements \u003cu\u003e[`ExTL` framework](https://hotsub.github.io/etl-and-extl)\u003c/u\u003e.\n\n# Installation\n\nCheck **[Getting Started](https://hotsub.github.io/getting-started)** on **[GitHub Pages](https://hotsub.github.io)**\n\n# Commands\n\n```sh\nNAME:\n   hotsub - command line to run batch computing on AWS and GCP with the same interface\n\nUSAGE:\n   hotsub [global options] command [command options] [arguments...]\n\nVERSION:\n   0.10.0\n\nDESCRIPTION:\n   Open-source command-line tool to run batch computing tasks and workflows on backend services such as Amazon Web Services.\n\nCOMMANDS:\n     run       Run your jobs on cloud with specified input files and any parameters\n     init      Initialize CLI environment on which hotsub runs\n     template  Create a template project of hotsub\n     help, h   Shows a list of commands or help for one command\n\nGLOBAL OPTIONS:\n   --help, -h     show help\n   --version, -V  print the version\n```\n\n## Available options for `run` command\n\n\n```sh\n% hotsub run -h\nNAME:\n   hotsub run - Run your jobs on cloud with specified input files and any parameters\n\nUSAGE:\n   hotsub run [command options] [arguments...]\n\nDESCRIPTION:\n   Run your jobs on cloud with specified input files and any parameters\n\nOPTIONS:\n   --verbose, -v                     Print verbose log for operation.\n   --log-dir value                   Path to log directory where stdout/stderr log files will be placed (default: \"${cwd}/logs/${time}\")\n   --concurrency value, -C value     Throttle concurrency number for running jobs (default: 8)\n   --provider value, -p value        Job service provider, either of [aws, gcp, vbox, hyperv] (default: \"aws\")\n   --tasks value                     Path to CSV of task parameters, expected to specify --env, --input, --input-recursive and --output-recursive. (required)\n   --image value                     Image name from Docker Hub or other Docker image service. 
(default: \"ubuntu:14.04\")\n   --script value                    Local path to a script to run inside the workflow Docker container. (required)\n   --shared value, -S value          Shared data URL on cloud storage bucket. (e.g. s3://~)\n   --keep                            Keep instances created for computing event after everything gets done\n   --env value, -E value             Environment variables to pass to all the workflow containers\n   --disk-size value                 Size of data disk to attach for each job in GB. (default: 64)\n   --shareddata-disksize value       Disk size of shared data instance (in GB) (default: 64)\n   --aws-region value                AWS region name in which AmazonEC2 instances would be launched (default: \"ap-northeast-1\")\n   --aws-ec2-instance-type value     AWS EC2 instance type. If specified, all --min-cores and --min-ram would be ignored. (default: \"t2.micro\")\n   --aws-shared-instance-type value  Shared Instance Type on AWS (default: \"m4.4xlarge\")\n   --aws-vpc-id value                VPC ID on which computing VMs are launched\n   --aws-subnet-id value             Subnet ID in which computing VMs are launched\n   --google-project value            Project ID for GCP\n   --google-zone value               GCP service zone name (default: \"asia-northeast1-a\")\n   --cwl value                       CWL file to run your workflow\n   --cwl-job value                   Parameter files for CWL\n   --wdl value                       WDL file to run your workflow\n   --wdl-job value                   Parameter files for WDL\n   --include value                   Local files to be included onto workflow container\n```\n\n# Contact\n\nTo make it transparent, ask any question from this 
link.\n\nhttps://github.com/otiai10/hotsub/issues\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fotiai10%2Fhotsub","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fotiai10%2Fhotsub","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fotiai10%2Fhotsub/lists"}