{"id":22051592,"url":"https://github.com/parlaynu/studio-aws-batch","last_synced_at":"2026-04-16T05:01:59.590Z","repository":{"id":218844792,"uuid":"746385628","full_name":"parlaynu/studio-aws-batch","owner":"parlaynu","description":"Create an AWS Batch cluster with supporting infrastructure and tooling ready to do real work.","archived":false,"fork":false,"pushed_at":"2024-01-29T10:08:36.000Z","size":343,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-18T04:30:28.512Z","etag":null,"topics":["ansible","aws-batch","aws-ec2-spot","aws-ecr","aws-endpoints","aws-iam","aws-s3","aws-vpc","docker","python3","terraform"],"latest_commit_sha":null,"homepage":"","language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/parlaynu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-21T22:09:04.000Z","updated_at":"2024-07-08T03:38:56.000Z","dependencies_parsed_at":"2024-07-08T05:44:34.723Z","dependency_job_id":"a652ed54-91a2-4963-b604-fc0bd612dde4","html_url":"https://github.com/parlaynu/studio-aws-batch","commit_stats":null,"previous_names":["parlaynu/studio-aws-batch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/parlaynu/studio-aws-batch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Fstudio-aws-batch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Fstudio-aws-batch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
parlaynu%2Fstudio-aws-batch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Fstudio-aws-batch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/parlaynu","download_url":"https://codeload.github.com/parlaynu/studio-aws-batch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parlaynu%2Fstudio-aws-batch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31872036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"online","status_checked_at":"2026-04-16T02:00:06.042Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ansible","aws-batch","aws-ec2-spot","aws-ecr","aws-endpoints","aws-iam","aws-s3","aws-vpc","docker","python3","terraform"],"created_at":"2024-11-30T15:09:53.350Z","updated_at":"2026-04-16T05:01:59.551Z","avatar_url":"https://github.com/parlaynu.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AWS Batch Cluster\n\nThis repository builds an [AWS Batch](https://aws.amazon.com/batch/) cluster. \n\nThere are a lot of things to put in place to get AWS Batch ready to do real work. 
This project gets you there\nby building the necessary core infrastructure (job definitions, queues, compute environments) and supporting \ninfrastructure (VPC with subnets and gateways ready to host EC2 spot instances, S3 buckets, ECR repository, \nIAM policies and roles so jobs can access the resources). It also provides some basic tools to submit and query jobs.\n\nThe default setup builds a demo or development cluster. This is a minimal setup to test the system. There\nare configuration options for more advanced, production-ready configurations, which are described in the\n[Production Cluster](#production-cluster) section below.\n\nThe demo cluster uses [docker engine](https://docs.docker.com/engine/install/) to build management containers \nwith the necessary third-party software (e.g. terraform and ansible) at the correct versions. This works \nwell for the demo cluster, but probably wouldn't work for a production cluster.\n\nTo run the submission tools, you will need a recent python3 installation, and will probably use virtual\nenvironments to install the software and dependencies.\n\nAnd of course, you also need an AWS account with a working configuration for the \n[CLI tools](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html).\n\nThe [Key Components](#key-components) section below gives an overview of the cluster's key components and\nhow they work together to run a job.\n\nThe [Demo Cluster](#demo-cluster) section describes how to build the demo or development cluster. This\nis a simple cluster that is useful to get familiar with the system, but isn't really a\nproduction-ready system. 
It also describes how to define, build and run jobs.\n\nFinally, the [Production Cluster](#production-cluster) section describes how to upgrade the demo\ncluster to make it more scalable and resilient for production systems.\n\n## Key Components\n\nThe key components of the cluster are shown in the diagram below.\n\n![Cluster Key Components](docs/cluster_key_components.png \"Cluster Key Components\")\n\nThese components are all built by the terraform configurations in this repository; these are the same\nfor both the demo cluster and any production cluster. See the [AWS Batch](https://aws.amazon.com/batch/)\ndocumentation for details on how they work together.\n\nThere are a lot of ways to build and use AWS Batch, but the workflow that is set up in this repository\nis that each job has an associated docker container image in ECR that contains\nall the software needed to run the job, loads data from an S3 bucket, optionally uses a workspace S3\nbucket for intermediate data, and writes results to the output S3 bucket.\n\nThese terraform configurations create the necessary IAM policies and roles for the compute environments to \nhave correct access to the buckets, but accessing the buckets is up to the application itself.\n\nThe compute environments vary substantially between demo and production environments and are described\nin the sections below.\n\n## Demo Cluster\n\n### Overview\n\nThe compute environment for the demo cluster is shown below:\n\n![Demo Compute Environment](docs/demo_compute_env.png \"Demo Compute Environment\")\n\nThe key things to note are:\n\n* runs in a single availability zone - not resilient\n* all access to S3 and ECR is via a single gateway server - not resilient, not scalable\n* compute resources are EC2 SPOT instances\n\nThe issues noted above can be fixed by setting simple configuration variables that make the cluster multi-zone and use\nendpoints to access resources. 
This is all described in the [production cluster](#production-cluster)\nsection.\n\n### Building the Cluster\n\nThe steps to build the cluster are described in the following documents:\n\n* [bootstrapping](docs/00-bootstrapping.md) the environment\n* building the [cluster](docs/01-cluster.md) infrastructure\n* building the [base containers](docs/02-base-containers.md)\n\n### Creating and Running Jobs\n\nThere are a number of things to do to create and run a job. They are described in [this](docs/job-helloworld.md)\ndocument using the `helloworld` job as an example.\n\n## Production Cluster\n\nThe default build can be customized by overriding variables in the [variables.tf](01-cluster/cluster/variables.tf) \nfile. This is done by creating a `terraform.tfvars` file and setting the values in there. The file\n[terraform.tfvars.example](01-cluster/cluster/terraform.tfvars.example) can be used as a starting point\nfor customizing the build and contains the variables most likely to be customized and an explanation of each.\n\nThe variables fall into three categories:\n\n* service endpoints\n* multiple availability zones\n* user-supplied buckets\n\n## Tools\n\nDocumentation for the tools can be found [here](docs/tools.md).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparlaynu%2Fstudio-aws-batch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparlaynu%2Fstudio-aws-batch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparlaynu%2Fstudio-aws-batch/lists"}