https://github.com/podaac/generate_partition_submit
Partitions Generate workflow data and submits AWS Batch jobs.
- Host: GitHub
- URL: https://github.com/podaac/generate_partition_submit
- Owner: podaac
- License: apache-2.0
- Created: 2023-01-23T22:25:31.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2026-04-12T21:11:09.000Z (2 months ago)
- Last Synced: 2026-04-12T22:25:48.948Z (2 months ago)
- Topics: development, generate
- Language: Python
- Size: 169 KB
- Stars: 1
- Watchers: 7
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# partition & submit
Partitions the text files produced by the download list creator (which contain available OBPG downloads) into AWS Batch jobs and then submits all Generate workflow jobs to AWS Batch (downloader -> combiner -> processor -> upload).
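As an illustration of the partitioning step, here is a minimal sketch that splits a list of download entries into fixed-size chunks, one chunk per job. The `partition` helper, the `chunk_size` value, and the file names are hypothetical; the component's real partitioning criteria are defined in its source and job config.

```python
from typing import Iterator, List


def partition(lines: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most chunk_size entries (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(lines), chunk_size):
        yield lines[start:start + chunk_size]


# Hypothetical download entries; real entries come from the download list text files.
downloads = [f"file_{i}.nc" for i in range(5)]
chunks = list(partition(downloads, 2))
```

The last chunk may be shorter than `chunk_size` when the list does not divide evenly.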
Partition & Submit also maintains a holding tank of refined SST files. If it detects any unmatched refined SST files, it stores them in a JSON file organized by date and hour in the `download-lists` S3 bucket under `holding_tank/{aqua|terra|viirs}`. It then loads any refined SST files older than a specified threshold and submits them as part of the Generate workflow. The threshold is specified in the config file.
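The holding tank behaviour can be sketched as follows, assuming the JSON file maps a date-and-hour key to a list of file names. The key format, the `release_expired` helper, and the sample file names are assumptions made for illustration, not the component's actual schema.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple

KEY_FORMAT = "%Y-%m-%dT%H"  # hypothetical date-and-hour key layout


def release_expired(
    tank: Dict[str, List[str]], now: datetime, threshold_hours: int
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split the tank into files older than the threshold and files still held."""
    cutoff = now - timedelta(hours=threshold_hours)
    released: List[str] = []
    remaining: Dict[str, List[str]] = {}
    for key in sorted(tank):
        stamp = datetime.strptime(key, KEY_FORMAT).replace(tzinfo=timezone.utc)
        if stamp < cutoff:
            released.extend(tank[key])  # old enough to submit to the workflow
        else:
            remaining[key] = tank[key]  # keep waiting for a match
    return released, remaining


# Hypothetical tank contents and threshold.
tank = {"2024-01-01T00": ["refined_a.nc"], "2024-01-01T10": ["refined_b.nc"]}
now = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)
released, remaining = release_expired(tank, now, threshold_hours=6)
```

In this sketch the released files would be added to the job submission, and `remaining` would be written back to the holding tank JSON file.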
Top-level Generate repo: https://github.com/podaac/generate
## pre-requisites to building
None.
## build command
`docker build --tag partition_submit:0.1 .`
## execute command
Arguments:
- Taken from the event that triggers the AWS Lambda function (the download list creator publishes to an SQS queue).
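A Lambda function triggered by SQS receives the queue messages under the `Records` key of the event, each with a string `body`. The sketch below reads those bodies, assuming each one is a JSON document. The `parse_sqs_event` helper and the `dataset` and `txt_list` keys are hypothetical; the real message schema is defined by the download list creator.

```python
import json
from typing import Any, Dict, List


def parse_sqs_event(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Decode the JSON body of every SQS record in a Lambda event."""
    messages = []
    for record in event.get("Records", []):
        messages.append(json.loads(record["body"]))
    return messages


# Hypothetical event; the message keys are assumptions for illustration only.
sample_event = {
    "Records": [
        {"body": json.dumps({"dataset": "aqua", "txt_list": ["aqua_list_0.txt"]})}
    ]
}
arguments = parse_sqs_event(sample_event)
```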
MODIS A:
`docker run --rm --name ps`
MODIS T:
`docker run --rm --name ps`
VIIRS:
`docker run --rm --name ps`
**NOTES**
- In order for the commands to execute the `/?/` directories will need to point to actual directories on the system.
- The job config file is hosted in an AWS S3 Bucket that the component loads into memory to create and submit AWS Batch jobs.
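One way to express the downloader -> combiner -> processor -> upload ordering in AWS Batch is the `dependsOn` parameter of `submit_job`, where each job waits for the previous one. The sketch below is an assumption about how such a chain could be built, not the component's actual implementation. The `submit_chain` helper, the queue and job definition names, and the `StubBatchClient` class are hypothetical; a real run would pass `boto3.client("batch")` instead of the stub.

```python
from typing import Any, Dict, List

STAGES = ["downloader", "combiner", "processor", "upload"]


def submit_chain(batch_client: Any, queues: Dict[str, str],
                 definitions: Dict[str, str], prefix: str) -> List[str]:
    """Submit one job per stage, each depending on the job before it."""
    job_ids: List[str] = []
    previous_id = None
    for stage in STAGES:
        request: Dict[str, Any] = {
            "jobName": f"{prefix}-{stage}",
            "jobQueue": queues[stage],
            "jobDefinition": definitions[stage],
        }
        if previous_id is not None:
            request["dependsOn"] = [{"jobId": previous_id}]
        previous_id = batch_client.submit_job(**request)["jobId"]
        job_ids.append(previous_id)
    return job_ids


class StubBatchClient:
    """Records requests so the sketch runs without AWS credentials."""

    def __init__(self) -> None:
        self.requests: List[Dict[str, Any]] = []

    def submit_job(self, **kwargs: Any) -> Dict[str, str]:
        self.requests.append(kwargs)
        return {"jobId": f"job-{len(self.requests)}", "jobName": kwargs["jobName"]}


# Hypothetical queue and job definition names.
queues = {stage: f"{stage}-queue" for stage in STAGES}
definitions = {stage: f"{stage}-definition" for stage in STAGES}
client = StubBatchClient()
job_ids = submit_chain(client, queues, definitions, "aqua")
```

Passing the client in as an argument keeps the chaining logic testable without touching AWS.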
## aws infrastructure
The partition and submit component includes the following AWS services:
- AWS Lambda function.
- AWS S3 bucket to hold input text files and job config file.
- AWS SQS queue to allow the publication of the list of text files.
- AWS SQS queue to track pending AWS jobs from previous runs.
- AWS SSM parameters to track the number of available IDL licenses.
## terraform
Deploys AWS infrastructure and stores state in an S3 backend using a DynamoDB table for locking.
To deploy:
1. Edit `terraform.tfvars` for the environment to deploy to.
2. Edit `terraform_conf/backend-{prefix}.conf` for the environment to deploy to.
3. Initialize terraform: `terraform init -backend-config=terraform_conf/backend-{prefix}.conf`
4. Plan terraform modifications: `terraform plan -out=tfplan`
5. Apply terraform modifications: `terraform apply tfplan`
`{prefix}` is the account or environment name.