{"id":15992826,"url":"https://github.com/manics/nextflow-aws-batch-tutorial","last_synced_at":"2026-03-19T19:35:47.578Z","repository":{"id":146023383,"uuid":"359903186","full_name":"manics/nextflow-aws-batch-tutorial","owner":"manics","description":"A basic example of using Nextflow with AWS Batch","archived":false,"fork":false,"pushed_at":"2021-06-16T16:13:46.000Z","size":4,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-22T00:46:46.796Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Nextflow","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/manics.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-20T17:46:07.000Z","updated_at":"2021-06-16T16:13:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"529ab2d8-de09-4ab7-8485-764cdebac8e2","html_url":"https://github.com/manics/nextflow-aws-batch-tutorial","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manics%2Fnextflow-aws-batch-tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manics%2Fnextflow-aws-batch-tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manics%2Fnextflow-aws-batch-tutorial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manics%2Fnextflow-aws-batch-tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/manics","download_url":"https://codeload.github.com/manics/nextflow-aws-batch-tutorial/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243713415,"owners_count":20335567,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-08T06:41:19.223Z","updated_at":"2026-01-03T11:06:18.682Z","avatar_url":"https://github.com/manics.png","language":"Nextflow","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Running Nextflow jobs on AWS Batch\n\nNotes on getting a Nextflow pipeline to run on AWS Batch:\nhttps://www.nextflow.io/docs/latest/awscloud.html\n\nNote that Nextflow [does not support Fargate](https://groups.google.com/g/nextflow/c/JFneg8d3x2w?pli=1), so you must use `EC2` or `EC2_SPOT` types.\n\n\n## Setting up a batch queue\n\nCreate a compute environment: https://docs.aws.amazon.com/cli/latest/reference/batch/create-compute-environment.html\n\nGet (or create) subnets:\n\n    aws ec2 describe-subnets --query 'Subnets[].SubnetId'\n\nGet the default security group (or alternatively create a new one):\n\n    aws ec2 describe-security-groups --group-names default\n\nGet the AWS Batch role ARN (this can be automatically created through the AWS console by creating and deleting a batch compute environment, but you can also [create it manually](https://docs.aws.amazon.com/batch/latest/userguide/service_IAM_role.html)).\n\n    aws iam get-role --role-name AWSServiceRoleForBatch\n\nCheck you have the AWS ECS instance and spot fleet roles.\nThese can be automatically created through 
the AWS console by creating and deleting an ECS spot cluster, but you can also create them manually: [ECS instance role](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/instance_IAM_role.html), [ECS spot fleet role](https://docs.aws.amazon.com/batch/latest/userguide/spot_fleet_IAM_role.html).\n\n    aws iam get-role --role-name ecsInstanceRole\n    aws iam get-role --role-name ecsSpotFleetRole\n\nHowever, the `ecsInstanceRole` does not contain the S3 permissions required by Nextflow, so you either need to augment that role or, preferably, create a new role and instance profile `nextflowEcsInstanceRole`:\n\n    aws iam create-role --role-name nextflowEcsInstanceRole --assume-role-policy-document file://nextflowEcsInstanceRole-assume-role-policy.json\n    aws iam attach-role-policy --role-name nextflowEcsInstanceRole --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role\n    aws iam attach-role-policy --role-name nextflowEcsInstanceRole --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess\n\n    aws iam create-instance-profile --instance-profile-name nextflowEcsInstanceRole\n    aws iam add-role-to-instance-profile --instance-profile-name nextflowEcsInstanceRole --role-name nextflowEcsInstanceRole\n\nThe above role has full S3 access; in production you may want to limit access to just one bucket.\n\nEdit [`batch-compute-environment.json`](./batch-compute-environment.json) and replace:\n  - `SUBNET_IDS`\n  - `SECURITY_GROUP_IDS`\n  - `AWS_BATCH_SERVICE_ROLE_ARN`\n\nNow create the compute environment:\n\n    aws batch create-compute-environment --cli-input-json file://batch-compute-environment-spot.json\n\nCreate the job queue:\n\n    aws batch create-job-queue --job-queue-name TEST-nextflow-batch-queue --state ENABLED --priority 1 --compute-environment-order order=1,computeEnvironment=TEST-nextflow-batch-compute-m4\n\n\n## Storage bucket\n\nNextflow with AWS Batch requires an S3 location to store its outputs.\nIf you don't already 
have a location, create a new bucket:\n\n    aws s3 mb s3://BUCKET_NAME\n\n\n## Nextflow task container\n\nNextflow on AWS Batch requires the AWS CLI to be present either in the Docker image used for executing tasks, or in the Docker host AMI. The latter is recommended so you can use unmodified task images for executing tasks, but for now build an image that includes the AWS CLI:\n\n    docker build -t \u003cdocker-hub-username\u003e/nextflow-test:latest ./docker-image\n    docker push \u003cdocker-hub-username\u003e/nextflow-test:latest\n\nEdit the `container` lines in [`tutorial.nf`](./tutorial.nf) to `\u003cdocker-hub-username\u003e/nextflow-test:latest`.\n\n\n## Running\n\n    nextflow run tutorial.nf -bucket-dir s3://BUCKET_NAME/some/path\n\nNote that if you are using temporary AWS session credentials, [setting them with environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`) does not work](https://github.com/nextflow-io/nextflow/issues/1724). Instead, add the temporary credentials to your `~/.aws/credentials` file and set `AWS_PROFILE=\u003cprofile-name\u003e`.\n\nYou can optionally [enable tracing](https://www.nextflow.io/docs/latest/tracing.html) by adding the flags `-with-report out.html` and/or `-with-trace`.\n\n\n## Fetching results\n\nList all files in the S3 bucket recursively:\n\n    aws s3 ls --recursive s3://BUCKET_NAME/some/path\n\nCopy all files:\n\n    aws s3 cp --recursive s3://BUCKET_NAME/some/path dest\n\n## Clean up\n\nDisable and delete the job queue, then the compute environment:\n\n    aws batch update-job-queue --job-queue TEST-nextflow-batch-queue --state DISABLED\n    aws batch delete-job-queue --job-queue TEST-nextflow-batch-queue\n    aws batch update-compute-environment --compute-environment TEST-nextflow-batch-compute-m4 --state DISABLED\n    aws batch delete-compute-environment --compute-environment TEST-nextflow-batch-compute-m4\n\n\n## Additional options\n- Specify a launch template in the compute environment to customise an AMI at 
launch time without rebuilding\nhttps://docs.aws.amazon.com/batch/latest/userguide/launch-templates.html\n- You can set a container or AWS job definition in `nextflow.config` instead of in the nextflow file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanics%2Fnextflow-aws-batch-tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmanics%2Fnextflow-aws-batch-tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanics%2Fnextflow-aws-batch-tutorial/lists"}