https://github.com/podaac/generate_downloader
Generate downloader component that downloads files from OBPG
https://github.com/podaac/generate_downloader
development generate
Last synced: 20 days ago
JSON representation
Generate downloader component that downloads files from OBPG
- Host: GitHub
- URL: https://github.com/podaac/generate_downloader
- Owner: podaac
- License: apache-2.0
- Created: 2023-01-09T15:39:45.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2026-04-12T21:11:00.000Z (2 months ago)
- Last Synced: 2026-04-12T22:25:48.845Z (2 months ago)
- Topics: development, generate
- Language: Python
- Homepage:
- Size: 253 KB
- Stars: 1
- Watchers: 7
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
README
# downloader
The downloader downloads files from the OBPG based on lists created by the download list creator component.
Top-level Generate repo: https://github.com/podaac/generate
## pre-requisites to building
The following variables must be present in the user's environment:
- MACHINE
- LOGIN
- PASSWORD
The script `netrc_env_vars.sh` found in the top-level `generate` directory can be used to export these variables to the user's environment. They are used to create the `.netrc` file used by the OBPG download operations to download data. See this link for more information on the .netrc file: https://podaac.github.io/2022-SWOT-Ocean-Cloud-Workshop/prerequisites/02_NASA_Earthdata_Authentication.html
## build command
`docker build --build-arg MACHINE --build-arg LOGIN --build-arg PASSWORD --tag downloader:0.1 . `
## execute command
Arguments:
1. list_directory
2. job_index
3. processing_level
4. separator_character
5. processing_type
6. top_level_output_directory
7. num_files_to_download
8. sleep_time_in_between_files
9. move_filelist_file_when_done
10. perform_checksum_flag
11. test_run_flag
MODIS A:
`docker run --name gen-test -v /downloader/input:/data/input -v /downloader/logs:/data/logs -v /downloader/output:/data/output -v /downloader/scratch:/data/scratch downloader:0.1 /data/lists 0 L2 SPACE MODIS_A /data/output 5 1 yes yess`
MODIS T:
`docker run --name gen-test -v /downloader/input:/data/input -v /downloader/logs:/data/logs -v /downloader/output:/data/output -v /downloader/scratch:/data/scratch downloader:0.1 /data/lists 0 L2 SPACE MODIS_T /data/output 5 1 yes yes`
VIIRS:
`docker run --name gen-test -v /downloader/input:/data/input -v /downloader/logs:/data/logs -v /downloader/output:/data/output -v /downloader/scratch:/data/scratch downloader:0.1 /data/lists 0 L2 SPACE VIIRS /data/output 5 1 yes yes`
Please note that in order for the commands to execute the `/downloader/` directories will need to point to actual directories on the system.
## aws infrastructure
The downloader includes the following AWS services:
- AWS Batch job definition.
- CloudWatch log group.
- Elastic Container Registry repository.
## terraform
Deploys AWS infrastructure and stores state in an S3 backend using a DynamoDB table for locking.
To deploy:
1. Edit `terraform.tfvars` for environment to deploy to.
2. Edit `terraform_conf/backed-{prefix}.conf` for environment deploy.
3. Initialize terraform: `terraform init -backend-config=terraform_conf/backend-{prefix}.conf`
4. Plan terraform modifications: `terraform plan -out=tfplan`
5. Apply terraform modifications: `terraform apply tfplan`
`{prefix}` is the account or environment name.