{"id":50467552,"url":"https://github.com/podaac/generate","last_synced_at":"2026-06-01T08:02:49.326Z","repository":{"id":103454382,"uuid":"482980054","full_name":"podaac/generate","owner":"podaac","description":"generates L2P datasets","archived":false,"fork":false,"pushed_at":"2026-04-12T21:08:36.000Z","size":143660,"stargazers_count":1,"open_issues_count":8,"forks_count":0,"subscribers_count":6,"default_branch":"main","last_synced_at":"2026-04-12T22:25:30.982Z","etag":null,"topics":["development","generate"],"latest_commit_sha":null,"homepage":null,"language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/podaac.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-04-18T20:00:23.000Z","updated_at":"2026-04-12T21:08:40.000Z","dependencies_parsed_at":"2023-07-08T12:15:25.960Z","dependency_job_id":"0d20e130-574c-4b69-afeb-2c5db099b110","html_url":"https://github.com/podaac/generate","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/podaac/generate","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/podaac%2Fgenerate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/podaac%2Fgenerate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/podaac%2Fgenerate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/podaac%2Fgenerate/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/podaac","download_url":"https://codeload.github.com/podaac/generate/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/podaac%2Fgenerate/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33765379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-01T02:00:06.963Z","response_time":115,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["development","generate"],"created_at":"2026-06-01T08:02:46.163Z","updated_at":"2026-06-01T08:02:49.272Z","avatar_url":"https://github.com/podaac.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# generate\n\nGenerate is a program that downloads data from the Ocean Biology Processing Group (OBPG). 
Generate processes the data it downloads to create four Level 2P datasets.\n\nGenerate downloads the following data:\n- MODIS Aqua: https://oceancolor.gsfc.nasa.gov/about/missions/aqua/\n- MODIS Terra: https://oceancolor.gsfc.nasa.gov/about/missions/terra/\n- VIIRS: https://oceancolor.gsfc.nasa.gov/about/missions/snpp/\n- JPSS1: https://oceancolor.gsfc.nasa.gov/about/missions/noaa20/\n\nThe API for searching and downloading data can be found here: https://oceancolor.gsfc.nasa.gov/data/download_methods/#api\n\nGenerate outputs the following data:\n- MODIS_A-JPL-L2P-v2019.0: https://podaac.jpl.nasa.gov/dataset/MODIS_A-JPL-L2P-v2019.0\n- MODIS_T-JPL-L2P-v2019.0: https://podaac.jpl.nasa.gov/dataset/MODIS_T-JPL-L2P-v2019.0\n- VIIRS_NPP-JPL-L2P-v2016.2: https://podaac.jpl.nasa.gov/dataset/VIIRS_NPP-JPL-L2P-v2016.2\n- VIIRS_JPSS1-JPL-L2P-v2024.0: https://podaac.jpl.nasa.gov/dataset/VIIRS_JPSS1-JPL-L2P-v2024.0\n\n## components\n\n![Generate Component Data Flow Diagram](diagrams/generate-data-flow.png)\n\nGenerate consists of several components:\n- download list creator: Creates a list of files to download (search and download from OBPG).\n- partition and submit: Partitions downloads into jobs and submits the Generate workflow as AWS Batch jobs.\n- downloader: Downloads files from lists created by the download list creator.\n- combiner: Combines downloaded files into a single NetCDF file.\n- processor: Processes combined files into a final L2P granule NetCDF file.\n- uploader: Uploads final L2P granules to an S3 bucket and kicks off archive ingestion.\n- cnm_responder: Processes CNM messages (responses) published to an SNS topic.\n- token_creator: Periodically creates or renews the EDL bearer token required to perform CMR queries.\n- license returner: Returns IDL licenses that were used in the current execution of the Generate workflow.\n- error_handler: Handles AWS Batch job failures by logging and notification.\n- error_checker: Checks for any files that have been quarantined 
and restarts the Generate workflow for those files.\n- reporter: Generates daily reports on the number of L2P granules that were processed for MODIS Aqua, MODIS Terra, and VIIRS.\n- purger: Deletes files from the EFS mount archive, downloader, combiner, and processor components that are older than a specific threshold.\n\nComponent repo links:\n- download list creator: https://github.com/podaac/generate_download_list_creator\n- partition and submit: https://github.com/podaac/generate_partition_submit\n- downloader: https://github.com/podaac/generate_downloader\n- combiner: https://github.com/podaac/generate_combiner\n- processor: https://github.com/podaac/generate_processor\n- uploader: https://github.com/podaac/generate_uploader\n- cnm_responder: https://github.com/podaac/generate_cnm_responder\n- token_creator: https://github.com/podaac/generate-token-creator\n- license returner: https://github.com/podaac/generate_license_returner\n- error_handler: https://github.com/podaac/generate_error_handler\n- error_checker: https://github.com/podaac/generate_error_checker\n- reporter: https://github.com/podaac/generate_reporter\n- purger: https://github.com/podaac/generate_purger\n\n## aws infrastructure\n\nThe Generate workflow includes the following AWS services:\n- AWS Batch compute environment with launch template and user-data script, job queue, and scheduling policy for each dataset.\n- Elastic file system for the following components: downloader, combiner, processor.\n- IAM roles and policies for Batch and ECS permissions.\n- S3 bucket to hold final L2P output.\n- Security groups to support EFS network traffic in VPC.\n\n## terraform \n\nTerraform deploys the AWS infrastructure and stores state in an S3 backend using a DynamoDB table for locking. The top-level `terraform` directory contains AWS infrastructure that applies to all components. Each component may have additional terraform files for deploying AWS resources; see each component's `README.md` for details.\n\nTo deploy:\n1. 
Edit `terraform.tfvars` for the environment to deploy to.\n2. Edit `terraform_conf/backend-{prefix}.conf` for the environment to deploy to.\n3. Initialize terraform: `terraform init -backend-config=terraform_conf/backend-{prefix}.conf`\n4. Plan terraform modifications: `terraform plan -out=tfplan`\n5. Apply terraform modifications: `terraform apply tfplan`\n\n`{prefix}` is the account or environment name.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpodaac%2Fgenerate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpodaac%2Fgenerate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpodaac%2Fgenerate/lists"}