Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nhatthaiquang-agilityio/kedro-aws-batch
Run a Kedro project on AWS Batch
https://github.com/nhatthaiquang-agilityio/kedro-aws-batch
aws-batch docker ecr-repositories kedro s3
Last synced: 19 days ago
JSON representation
Run a Kedro project on AWS Batch
- Host: GitHub
- URL: https://github.com/nhatthaiquang-agilityio/kedro-aws-batch
- Owner: nhatthaiquang-agilityio
- Created: 2021-01-05T03:31:31.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-01-28T01:43:08.000Z (almost 4 years ago)
- Last Synced: 2024-04-16T07:09:40.664Z (9 months ago)
- Topics: aws-batch, docker, ecr-repositories, kedro, s3
- Language: Python
- Homepage:
- Size: 1.59 MB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Kedro AWS Batch
Run a Kedro Project on AWS Batch### Prerequisites
+ Docker
+ Kedro 0.16.6
+ ECR, S3 & AWS Batch
+ scikit-learn 0.23.0
+ pickle5 0.0.11### Build
+ Build image
```
example$ ./scripts/build.sh
```+ ECR Login
```
aws ecr get-login-password --region | docker login --username AWS --password-stdin .dkr.ecr..amazonaws.com
```
+ Login AWS Console and create an ECR repository+ Push image into ECR
```
docker tag [repository uri]
docker push [repository uri]
```+ Create IAM role
```
Name the newly-created IAM role `batchJobRole`.
The policy (step 3) should be added `AmazonS3FullAccess`
```+ Create AWS Batch compute environment
```
Create a managed, on-demand one named `spaceflights_env` and
let it choose to create new service and instance roles
```+ Create AWS Batch job queue
```
Create a queue named `spaceflights_queue`,
connected to your newly created compute environment `spaceflights_env`, and give it `Priority` 1.
```+ Create AWS Batch job definition
```
Create a job definition named `kedro_run`, assign it the newly created `batchJobRole` IAM role,
the container image you’ve packaged above, execution timeout of 300s and 2000MB of memory
```
```
For me: Should set the execution timeout is 900s.
It avoids the main batch to fail due to execution timeout.
```+ Run Kedro node(Run Jobs)
```
Command: kedro run --node preprocessing_companies
```+ Submit AWS Batch jobs(Run Jobs)
```
Command: kedro run --env aws_batch --runner example.runner.AWSBatchRunner
```### Issues
+ Error: ECR registry auth
```
ResourceInitializationError: unable to pull secrets or registry auth:
execution resource retrieval failed: unable to retrieve ecr registry auth:
service call has been retried 1 time(s):
AccessDeniedException: User: arn:aws:sts::783560535431:assumed-rol...ResourceInitializationError: unable to pull secrets or registry auth:
execution resource retrieval failed:
unable to retrieve ecr registry auth: service call has been retried 1 time(s):
RequestError: send request failed caused by: Post https://api.ecr....
```Fixed: add permission ECS for batchJobRole
+ Error: Cloudwatch log stream
```
ResourceInitializationError: failed to validate logger args:
create stream has been retried 1 times: failed to create Cloudwatch log stream:
AccessDeniedException: User: arn:aws:sts::783560535431:assumed-role/batchJobRole/986cca09ac1748c08b77360b92e314...
```Fixed: add permission CloudWatch for batchJobRole
```
ECS-CloudWatchLogs
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_cloudwatch_logs.html{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:DescribeLogStreams"
],
"Resource": [
"arn:aws:logs:*:*:*"
]
}
]
}
```+ Error: SubmitJob
```
botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when
calling the SubmitJob operation:
User: arn:aws:sts::783560535431:assumed-role/batchJobRole/27a6c01f23ec42dfac5d20d539eb48bf
is not authorized to perform:
batch:SubmitJob on resource: arn:aws:batch:ap-southeast-1:783560535431:job-definition/kedro_run
```Fixed: add permission `AWSBatchFullAccess` for batchJobRole
+ BatchJobRole
![BatchJobRole](images/batchJobRole.jpg)### Results
+ Kedro Visualise Pipelines
![Kedro Viz](images/Kedro-Viz.jpg)+ Kedro run Node
![NodeRun](images/kedro-run-node.png)+ Kedro Example Job
![ExampleJob](images/kedro-example-job.jpg)+ Kedro Example Job Log
![ExampleJobLog](images/kedro-example-job-log.jpg)### References
+ [Deployment on AWS Batch](https://kedro.readthedocs.io/en/0.16.6/10_deployment/07_aws_batch.html)
+ [CloudWatch](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_cloudwatch_logs.html)
+ [BotoCore Client Error](https://stackoverflow.com/questions/61976547/botocore-exceptions-clienterror-an-error-occurred-accessdeniedexception)
+ [Getting started with AWS Batch](https://medium0.com/weareservian/getting-started-with-aws-batch-3442446fc62)
+ [Send ECS Container Logs to CloudWatch Logs](https://aws.amazon.com/blogs/devops/send-ecs-container-logs-to-cloudwatch-logs-for-centralized-monitoring/)