{"id":37305433,"url":"https://github.com/ciroh-ua/forcingprocessor","last_synced_at":"2026-04-24T23:06:07.504Z","repository":{"id":314139260,"uuid":"1052799392","full_name":"CIROH-UA/forcingprocessor","owner":"CIROH-UA","description":"ForcingProcessor calculates NextGen catchment-averaged forcings from gridded sources like the National Water Model.","archived":false,"fork":false,"pushed_at":"2026-02-19T18:32:58.000Z","size":19658,"stargazers_count":2,"open_issues_count":16,"forks_count":6,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-19T21:25:10.334Z","etag":null,"topics":["atmospheric","aws","boto3","ciroh","cloud","datastream","docker","forcings","geopackage","geospatial-processing","high-performance","multiprocessing","nrds","nwm","precipitation","python"],"latest_commit_sha":null,"homepage":"https://docs.ciroh.org/docs/products/research-datastream/forcingprocessor/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CIROH-UA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-08T15:04:24.000Z","updated_at":"2026-02-19T18:22:28.000Z","dependencies_parsed_at":"2025-09-10T21:44:17.820Z","dependency_job_id":"69bc2572-cb48-4351-a92e-52041f8d7985","html_url":"https://github.com/CIROH-UA/forcingprocessor","commit_stats":null,"previous_names":["ciroh-ua/forcingprocessor"],"tags_count":9,"template":false,"template_full_name":"AlabamaWaterInstitute/awi-open-source-project-template","purl":"pkg
:github/CIROH-UA/forcingprocessor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CIROH-UA%2Fforcingprocessor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CIROH-UA%2Fforcingprocessor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CIROH-UA%2Fforcingprocessor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CIROH-UA%2Fforcingprocessor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CIROH-UA","download_url":"https://codeload.github.com/CIROH-UA/forcingprocessor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CIROH-UA%2Fforcingprocessor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29637400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-19T22:32:43.237Z","status":"ssl_error","status_checked_at":"2026-02-19T22:32:38.330Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atmospheric","aws","boto3","ciroh","cloud","datastream","docker","forcings","geopackage","geospatial-processing","high-performance","multiprocessing","nrds","nwm","precipitation","python"],"created_at":"2026-01-16T02:54:41.124Z","updated_at":"2026-04-24T23:06:07.492Z","avatar_url":"https://github.com/CIROH-UA.png","language":"Python","readme":"# Forcing Processor\nForcingprocessor converts National Water Model (NWM) forcing data into Next Generation National Water Model (NextGen) forcing data. This tool provides the forcing pre-processing for the [NextGen Research DataStream](https://github.com/CIROH-UA/ngen-datastream).\n\nThe motivation for this tool is that NWM forcing data is gridded and stored in a separate netCDF file for each forecast hour, whereas ngen ingests the same forcing data as catchment-averaged time series.\n\n![forcing_gif](docs/gifs/T2D_2_TMP_2maboveground_cali.gif)\n\n## Install UV\n```\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n```\n## Create a Python Virtual Environment\n```\nuv venv\n```\n## Install Requirements\n```\nuv pip install -r pyproject.toml\n```\n## Create Output Directory\n```\nmkdir -p data/forcing\n```\n## Run the Forcingprocessor\n```\nuv run python src/forcingprocessor/processor.py ./configs/conf_fp.json\n```\nPrior to executing the processor, the user will need to obtain a geopackage file to define the spatial domain. 
The user will define the time domain by generating the forcing filenames for `processor.py` via `nwm_filenames_generator.py`, which is explained [here](#nwm_file). Note that `forcingprocessor` will calculate weights if they are not found within the geopackage file.\n\n### Channel Routing Data Extraction\n\nThis tool can also extract `q_lateral` data from the NWM's channel routing data sources. This is\nhelpful for experimenting with routing simulations. Note that channel routing data extraction\ncannot currently be run at the same time as forcing data extraction. This use case requires a\nmapping file ending in `map.json` in the format\n```\n{ngen-nex-id-1: [nwm-id-1-1,...],\n.\n.\n.\nngen-nex-id-k: [nwm-id-k-1,...]}\n```\nwhere each list of `nwm-id`s contains the NHD reaches associated with that NextGen hydrofabric nexus.\n\n## Example `conf.json`\n```\n{\n    \"forcing\"  : {\n        \"nwm_file\"     : \"\",\n        \"gpkg_file\"    : \"\"\n    },\n\n    \"storage\":{\n        \"output_path\"      : \"\",\n        \"output_file_type\" : []\n    },\n\n    \"run\" : {\n        \"verbose\"       : true,\n        \"collect_stats\" : true,\n        \"nprocs\"        : 2\n    },\n\n    \"plot\":{\n        \"nts\"        : 24,\n        \"ngen_vars\"  : [\n            \"DLWRF_surface\",\n            \"APCP_surface\",\n            \"precip_rate\",\n            \"TMP_2maboveground\"\n        ]\n    }\n}\n```\n\n## `conf.json` Options\n### 1. Forcing\n| Field             | Description              | Required |\n|-------------------|--------------------------|----------|\n| nwm_file          | Path to a text file containing NWM filenames, one per line. [Tool](#nwm_file) to create this file | :white_check_mark: |\n| gpkg_file       | Geopackage file to define spatial domain. Use [hfsubset](https://github.com/lynker-spatial/hfsubsetCLI) to generate a geopackage with a `forcing-weights` layer. Accepts local absolute path, s3 URI or URL. 
Also acceptable is a weights parquet generated with [weights_hf2ds.py](https://github.com/CIROH-UA/forcingprocessor/blob/main/src/forcingprocessor/weights_hf2ds.py), though the plotting option will no longer be available. |  :white_check_mark: |\n| map_file          | Path to a json containing the NWM to NGEN mapping for channel routing data extraction. Absolute path or s3 URI |  |\n| restart_map_file          | Path to a json containing the NWM to NGEN catchment mapping for t-route restart generation. Absolute path or s3 URI |  |\n| crosswalk_file          | Path to a netCDF containing the exact order of the catchments in the t-route restart file. Absolute path or s3 URI |  |\n| routelink_file          | Path to a netCDF containing the NWM channel geometry data, needed for t-route restart generation. Absolute path or s3 URI |  |\n\n### 2. Storage\n\n| Field             | Description                       | Required |\n|-------------------|-----------------------------------|----------|\n| storage_type      | Type of storage (local or s3 URI)     | :white_check_mark: |\n| output_path       | Path to write data to. Accepts local path or s3 URI | :white_check_mark: |\n| output_file_type  | List of output file types, e.g. [\"tar\",\"parquet\",\"csv\",\"netcdf\"]  | :white_check_mark: |\n\n### 3. Run\n| Field             | Description                    | Required |\n|-------------------|--------------------------------|----------|\n| verbose           | Get print statements, defaults to false           |  :white_check_mark: |\n| collect_stats     | Collect forcing metadata, defaults to true       |  :white_check_mark: |\n| nprocs      | Number of data processing processes, defaults to 50% available cores |   |\n\n### 4. 
Plot\nUse this field to create a side-by-side gif of the nwm and ngen forcings\n| Field             | Description                    | Required |\n|-------------------|--------------------------------|----------|\n| nts           | Number of timesteps to include in the gif, default is 10           |   |\n| ngen_vars     | Which ngen forcing variables to create gifs of, default is all of them  |   |\n```\nngen_variables = [\n    \"UGRD_10maboveground\",\n    \"VGRD_10maboveground\",\n    \"DLWRF_surface\",\n    \"APCP_surface\",\n    \"precip_rate\",\n    \"TMP_2maboveground\",\n    \"SPFH_2maboveground\",\n    \"PRES_surface\",\n    \"DSWRF_surface\",\n]\n```\n\n## nwm_file\nA text file given to forcingprocessor that contains each nwm forcing file name. These can be URLs or local paths. This file can be generated with the [nwmurl tool](https://github.com/CIROH-UA/nwmurl); a [generator script](https://github.com/CIROH-UA/forcingprocessor/blob/main/src/forcingprocessor/nwm_filenames_generator.py) is provided within this repo. The config argument accepts an s3 URL.\n ```\n python nwm_filenames_generator.py conf_nwm_files.json\n ```\n An example configuration file:\n ```\n {\n    \"forcing_type\" : \"operational_archive\",\n    \"start_date\"   : \"202310300000\",\n    \"end_date\"     : \"202310300000\",\n    \"runinput\"     : 1,\n    \"varinput\"     : 5,\n    \"geoinput\"     : 1,\n    \"meminput\"     : 0,\n    \"urlbaseinput\" : 7,\n    \"fcst_cycle\"   : [0],\n    \"lead_time\"    : [1]\n}\n ```\n\n## Weights\nTo calculate NextGen forcings, \"weights\" must be calculated to extract polygon-averaged data from gridded data. The weights are made up of two parts, the `cell_id` and `coverage`. 
These are calculated via [exactextract](https://github.com/isciences/exactextract) within [weights_hf2ds.py](https://github.com/CIROH-UA/forcingprocessor/blob/main/src/forcingprocessor/weights_hf2ds.py), which is optionally called from forcingprocessor.\n\nIf a geopackage is supplied to forcingprocessor, it will be searched for the layer `forcings-weights`. If this layer is found, these weights are used during processing. If not, forcingprocessor will call [weights_hf2ds.py](https://github.com/CIROH-UA/forcingprocessor/blob/main/src/forcingprocessor/weights_hf2ds.py) to calculate the weights (cell_id and coverage) for every divide-id in the geopackage. This can take time, so forcingprocessor writes the weights to a parquet file in the metadata, which can be reused in future forcingprocessor executions.\n\nExample of a direct call:\n```\npython3 forcingprocessor/src/forcingprocessor/weights_hf2ds.py \\\n--outname ./weights.parquet \\\n--input_file ./nextgen_VPU_03W.gpkg\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fciroh-ua%2Fforcingprocessor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fciroh-ua%2Fforcingprocessor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fciroh-ua%2Fforcingprocessor/lists"}