https://github.com/erdc/spt_compute
Runs streamflow forecasts using ECMWF predicted runoff and RAPID (Forked from: https://github.com/CI-WATER/erfp_data_process_ubuntu_aws).
https://github.com/erdc/spt_compute
ecmwf forecast-climate forecasting forecasting-models hydraulics hydrology rapid
Last synced: 6 months ago
JSON representation
Runs streamflow forecasts using ECMWF predicted runoff and RAPID (Forked from: https://github.com/CI-WATER/erfp_data_process_ubuntu_aws).
- Host: GitHub
- URL: https://github.com/erdc/spt_compute
- Owner: erdc
- License: bsd-3-clause
- Created: 2015-08-21T16:07:30.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2024-03-13T21:16:46.000Z (about 2 years ago)
- Last Synced: 2024-05-02T04:00:56.826Z (about 2 years ago)
- Topics: ecmwf, forecast-climate, forecasting, forecasting-models, hydraulics, hydrology, rapid
- Language: Python
- Homepage:
- Size: 23.7 MB
- Stars: 8
- Watchers: 21
- Forks: 16
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# spt_compute
(Previously spt_ecmwf_autorapid_process)
Computational framework to ingest ECMWF ensemble runoff forcasts or other Land Surface Model input; generate input for and run the RAPID (rapid-hub.org) program using HTCondor or Python's Multiprocessing; and upload to CKAN in order to be used by the Streamflow Prediction Tool (SPT). There is also an experimental option to use the AutoRoute program for flood inundation mapping.
[](https://github.com/erdc/spt_compute/blob/master/LICENSE)
[](https://travis-ci.org/erdc/spt_compute)
[](https://zenodo.org/badge/latestdoi/19918/erdc/spt_ecmwf_autorapid_process)
## How it works:
Snow, Alan D., Scott D. Christensen, Nathan R. Swain, E. James Nelson, Daniel P. Ames, Norman L. Jones,
Deng Ding, Nawajish S. Noman, Cedric H. David, Florian Pappenberger, and Ervin Zsoter, 2016. A High-Resolution
National-Scale Hydrologic Forecast System from a Global Ensemble Land Surface Model. *Journal of the
American Water Resources Association (JAWRA)* 1-15, DOI: 10.1111/1752-1688.12434
Snow, Alan Dee, "A New Global Forecasting Model to Produce High-Resolution Stream Forecasts" (2015). All Theses and Dissertations. Paper 5272. http://scholarsarchive.byu.edu/etd/5272
# Installation
## Step 1: Install RAPID and RAPIDpy
See: https://github.com/erdc/RAPIDpy
## Step 2: Install HTCondor (if not using Amazon Web Services and StarCluster or not using Multiprocessing mode)
### On Ubuntu
```
apt-get install -y libvirt0 libdate-manip-perl vim
wget http://ciwckan.chpc.utah.edu/dataset/be272798-f2a7-4b27-9dc8-4a131f0bb3f0/resource/86aa16c9-0575-44f7-a143-a050cd72f4c8/download/condor8.2.8312769ubuntu14.04amd64.deb
dpkg -i condor8.2.8312769ubuntu14.04amd64.deb
```
### On RedHat/CentOS 7
See: https://research.cs.wisc.edu/htcondor/yum/
### After Installation:
```
#if master node uncomment CONDOR_HOST and comment out CONDOR_HOST and DAEMON_LIST lines
#echo CONDOR_HOST = \$\(IP_ADDRESS\) >> /etc/condor/condor_config.local
echo CONDOR_HOST = 10.8.123.71 >> /etc/condor/condor_config.local
echo DAEMON_LIST = MASTER, SCHEDD, STARTD >> /etc/condor/condor_config.local
echo ALLOW_ADMINISTRATOR = \$\(CONDOR_HOST\), 10.8.123.* >> /etc/condor/condor_config.local
echo ALLOW_OWNER = \$\(FULL_HOSTNAME\), \$\(ALLOW_ADMINISTRATOR\), \$\(CONDOR_HOST\), 10.8.123.* >> /etc/condor/condor_config.local
echo ALLOW_READ = \$\(FULL_HOSTNAME\), \$\(CONDOR_HOST\), 10.8.123.* >> /etc/condor/condor_config.local
echo ALLOW_WRITE = \$\(FULL_HOSTNAME\), \$\(CONDOR_HOST\), 10.8.123.* >> /etc/condor/condor_config.local
echo START = True >> /etc/condor/condor_config.local
echo SUSPEND = False >> /etc/condor/condor_config.local
echo CONTINUE = True >> /etc/condor/condor_config.local
echo PREEMPT = False >> /etc/condor/condor_config.local
echo KILL = False >> /etc/condor/condor_config.local
echo WANT_SUSPEND = False >> /etc/condor/condor_config.local
echo WANT_VACATE = False >> /etc/condor/condor_config.local
```
NOTE: if you forgot to change lines for master node, change CONDOR_HOST = $(IP_ADDRESS)
and restart condor as ROOT
If Ubuntu:
```
# . /etc/init.d/condor stop
# . /etc/init.d/condor start
```
If RedHat:
```
# systemctl stop condor
# systemctl start condor
```
## Step 3: Install Prerequisite Packages
### On Ubuntu:
```
$ apt-get install libssl-dev libffi-dev
$ sudo su
$ pip install requests_toolbelt tethys_dataset_services condorpy
$ exit
```
### On RedHat/CentOS 7:
```
$ yum install libffi-devel openssl-devel
$ sudo su
$ pip install requests_toolbelt tethys_dataset_services condorpy
$ exit
```
If you are on RHEL 7 and having troubles, add the epel repo:
```
$ wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ sudo rpm -Uvh epel-release-7*.rpm
```
If you are on CentOS 7 and having troubles, add the epel repo:
```
$ sudo yum install epel-release
```
Then install packages listed above.
## Step 4: (Optional) Install AutoRoute and AutoRoutePy
If you want to try out the forecasted AutoRoute flood inundation (BETA), you will need to complete this section.
Follow the instructions here: https://github.com/erdc/AutoRoutePy
## Step 5: Install Submodule Dependencies
See: https://github.com/erdc/spt_dataset_manager
## Step 6: Download and install the source code
```
$ cd /path/to/your/scripts/
$ git clone https://github.com/erdc/spt_ecmwf_autorapid_process.git
$ cd spt_ecmwf_autorapid_process
$ python setup.py install
```
## Step 7: Create folders for RAPID input and for downloading ECMWF
```
$ cd /your/working/directory
$ mkdir -p rapid-io/input rapid-io/output ecmwf logs subprocess_logs era_interim_watershed mp_execute
```
## Step 8: Change the locations in the files
Create a file *run_ecmwf_rapid.py* and change these variables for your instance. See below for different configurations.
```python
# -*- coding: utf-8 -*-
from spt_compute import run_ecmwf_forecast_process
#------------------------------------------------------------------------------
#main process
#------------------------------------------------------------------------------
if __name__ == "__main__":
run_ecmwf_forecast_process(
rapid_executable_location='/home/alan/scripts/rapid/src/rapid',
rapid_io_files_location='/home/alan/rapid-io',
ecmwf_forecast_location ="/home/alan/ecmwf",
era_interim_data_location="/home/alan/era_interim_watershed",
subprocess_log_directory='/home/alan/subprocess_logs',
main_log_directory='/home/alan/logs',
data_store_url='http://your-ckan/api/3/action',
data_store_api_key='your-ckan-api-key',
data_store_owner_org="your-organization",
app_instance_id='your-streamflow_prediction_tool-app-id',
#sync_rapid_input_with_ckan=False,
download_ecmwf=True,
ftp_host="ftp.ecmwf.int",
ftp_login="",
ftp_passwd="",
ftp_directory="",
upload_output_to_ckan=True,
initialize_flows=True,
create_warning_points=True,
delete_output_when_done=True,
mp_mode='htcondor',
#mp_execute_directory='',
)
```
### run_ecmwf_rapid_process Function Variables
|Variable|Data Type|Description|Default|
|---|:---:|---|:---:|
|*rapid_executable_location*|String|Path to RAPID executable.||
|*rapid_io_files_location*|String|Path to RAPID input/output directory.||
|*ecmwf_forecast_location*|String|Path to ECMWF forecasts.||
|*main_log_directory*|String|Path to store HTCondor/multiprocess logs.||
|*data_store_url*|String|(Optional) CKAN API url (e.g. http://your-ckan/api/3/action)|""|
|*data_store_api_key*|String|(Optional) CKAN API Key (e.g. abcd-1234-defr-3345)|""|
|*data_store_owner_org*|String|(Optional) CKAN owner organization (e.g. erdc).|""|
|*app_instance_id*|String|(Optional) Streamflow Prediction tool instance ID. |""|
|*sync_rapid_input_with_ckan*|Boolean|(Optional) If set to true, this will download ECMWF-RAPID input cooresponding to your instance of the Streamflow Prediction Tool. |False|
|*download_ecmwf*|Boolean|(Optional) If set to true, this will download the most recent ECMWF forecasts for today before runnning the process. |True|
|*date_string*|String|(Optional) This string will be used to modify the date of the forecasts downloaded and/or the forecasts ran. It is in the format yyyymmdd (e.g. 20160808). |None|
|*ftp_host*|String|(Optional) ECMWF ftp site path (e.g. ftp.ecmwf.int). |""|
|*ftp_login*|String|(Optional) ECMWF ftp login name. |""|
|*ftp_passwd*|String|(Optional) ECMWF ftp password. |""|
|*ftp_directory*|String|(Optional) ECMWF ftp directory. |""|
|*delete_past_ecmwf_forecasts*|Boolean|(Optional) If True, it deletes all past forecasts before the next download. |True|
|*upload_output_to_ckan*|Boolean|(Optional) If true, this will upload the output to CKAN for the Streamflow Prediction Tool to download. |False|
|*delete_output_when_done*|String|(Optional) If true, all output will be deleted when the process completes. It is used when using operationally with *upload_output_to_ckan* set to true. |False|
|*initialize_flows*|String|(Optional) If true, this will initialize flows from all avaialble methods (e.g. Past forecasts, historical data, streamgage data). |False|
|*warning_flows_threshold*|Float|(Optional) Minimum value for return period in m3/s to generate warning. |10|
|*era_interim_data_location*|String|(Optional) Path to ERA Interim based historical streamflow, return period data, and seasonal average data. |""|
|*create_warning_points*|Boolean|(Optional) Generate waring points for Streamflow Prediction Tool. This requires return period data to be located in the *era_interim_data_location*. |False|
|*autoroute_executable_location*|String|(Optional/Beta) Path to AutoRoute executable. |""|
|*autoroute_io_files_location*|String|(Optional/Beta) Path to AutoRoute input/output directory. |""|
|*geoserver_url*|String|(Optional/Beta) Url to API endpoint ending in geoserver/rest. |""|
|*geoserver_username*|String|(Optional/Beta) Username for geoserver. |""|
|*geoserver_password*|String|(Optional/Beta) Password for geoserver. |""|
|*mp_mode*|String|(Optional) This defines how the process is run (HTCondor or Python's Multiprocessing). Valid options are htcondor and multiprocess. |htcondor|
|*mp_execute_directory*|String|(Optional/Required if using multiprocess mode) Directory used in multiprocessing mode to temporarily store files begin generated. |""|
### Possible run configurations
There are many different configurations. Here are some examples.
#### Mode 1: Run ECMWF-RAPID for Streamflow Prediction Tool using HTCondor to run and CKAN to upload
```python
run_ecmwf_forecast_process(
rapid_executable_location='/home/alan/scripts/rapid/src/rapid',
rapid_io_files_location='/home/alan/rapid-io',
ecmwf_forecast_location ="/home/alan/ecmwf",
era_interim_data_location="/home/alan/era_interim_watershed",
subprocess_log_directory='/home/alan/subprocess_logs',
main_log_directory='/home/alan/logs',
data_store_url='http://your-ckan/api/3/action',
data_store_api_key='your-ckan-api-key',
data_store_owner_org="your-organization",
app_instance_id='your-streamflow_prediction_tool-app-id',
download_ecmwf=True,
ftp_host="ftp.ecmwf.int",
ftp_login="",
ftp_passwd="",
ftp_directory="",
upload_output_to_ckan=True,
initialize_flows=True,
create_warning_points=True,
delete_output_when_done=True,
)
```
#### Mode 2: Run ECMWF-RAPID for Streamflow Prediction Tool using HTCondor to run and CKAN to upload & to download model files
```python
run_ecmwf_forecast_process(
rapid_executable_location='/home/alan/scripts/rapid/src/rapid',
rapid_io_files_location='/home/alan/rapid-io',
ecmwf_forecast_location ="/home/alan/ecmwf",
era_interim_data_location="/home/alan/era_interim_watershed",
subprocess_log_directory='/home/alan/subprocess_logs',
main_log_directory='/home/alan/logs',
data_store_url='http://your-ckan/api/3/action',
data_store_api_key='your-ckan-api-key',
data_store_owner_org="your-organization",
app_instance_id='your-streamflow_prediction_tool-app-id',
sync_rapid_input_with_ckan=True,
download_ecmwf=True,
ftp_host="ftp.ecmwf.int",
ftp_login="",
ftp_passwd="",
ftp_directory="",
upload_output_to_ckan=True,
initialize_flows=True,
create_warning_points=True,
delete_output_when_done=True,
)
```
#### Mode 3: Run ECMWF-RAPID for Streamflow Prediction Tool using Multiprocessing to run and CKAN to upload
```python
run_ecmwf_forecast_process(
rapid_executable_location='/home/alan/scripts/rapid/src/rapid',
rapid_io_files_location='/home/alan/rapid-io',
ecmwf_forecast_location ="/home/alan/ecmwf",
era_interim_data_location="/home/alan/era_interim_watershed",
subprocess_log_directory='/home/alan/subprocess_logs',
main_log_directory='/home/alan/logs',
data_store_url='http://your-ckan/api/3/action',
data_store_api_key='your-ckan-api-key',
data_store_owner_org="your-organization",
app_instance_id='your-streamflow_prediction_tool-app-id',
download_ecmwf=True,
ftp_host="ftp.ecmwf.int",
ftp_login="",
ftp_passwd="",
ftp_directory="",
upload_output_to_ckan=True,
initialize_flows=True,
create_warning_points=True,
delete_output_when_done=True,
mp_mode='multiprocess',
mp_execute_directory='/home/alan/mp_execute',
)
```
#### Mode 4: (BETA) Run ECMWF-RAPID for Streamflow Prediction Tool with AutoRoute using Multiprocessing to run
Note that in this example, CKAN was not used. However, you can still add CKAN back in to this example with the parameters shown in the previous examples.
```python
run_ecmwf_forecast_process(
rapid_executable_location='/home/alan/rapid/src/rapid',
rapid_io_files_location='/home/alan/rapid-io',
ecmwf_forecast_location ="/home/alan/ecmwf",
era_interim_data_location="/home/alan/era_interim_watershed",
subprocess_log_directory='/home/alan/subprocess_logs', #path to store HTCondor/multiprocess logs
main_log_directory='/home/alan/logs',
download_ecmwf=True,
ftp_host="ftp.ecmwf.int",
ftp_login="",
ftp_passwd="",
ftp_directory="",
upload_output_to_ckan=True,
initialize_flows=True,
create_warning_points=True,
delete_output_when_done=False,
autoroute_executable_location='/home/alan/scripts/AutoRoute/src/autoroute',
autoroute_io_files_location='/home/alan/autoroute-io',
geoserver_url='http://localhost:8181/geoserver/rest',
geoserver_username='admin',
geoserver_password='password',
mp_mode='multiprocess',
mp_execute_directory='/home/alan/mp_execute',
)
```
## Step 9: Make sure permissions are correct for these files and any directories the script will use
Example:
```
$ chmod u+x run_ecmwf_rapid.py
```
## Step 10: Add RAPID files to the rapid-io/input directory
To generate these files see: https://github.com/erdc/RAPIDpy/wiki/GIS-Tools. If you are using the *sync_rapid_input_with_ckan* option, then you would upload these files through the Streamflow Prediction Tool web interface and this step is unnecessary.
Make sure the directory is in the format [watershed_name]-[subbasin_name]
with lowercase letters, numbers, and underscores only. No spaces!
Example:
```
$ ls /rapid/input
nfie_texas_gulf_region-huc_2_12
$ ls /rapid/input/nfie_texas_gulf_region-huc_2_12
comid_lat_lon_z.csv
k.csv
rapid_connect.csv
riv_bas_id.csv
weight_ecmwf_t1279.csv
weight_ecmwf_tco639.csv
x.csv
```
## Step 11: Create CRON job to run the scripts hourly
To run this automatically, it is necessary to generate cron jobs to run the script. There are many ways to do this and two are presented here.
### Method 1: In terminal using crontab command
```
$ crontab -e
```
Then add:
```
@hourly /usr/bin/env python /path/to/run_ecmwf_rapid.py # ECMWF RAPID PROCESS
```
### Method 2: Use *create_cron.py* to create the CRON jobs:
1) Install crontab Python package.
```
$ pip install python-crontab
```
2) Create and run a script to initialize cron job *create_cron.py*.
```python
from spt_compute.setup import create_cron
create_cron(execute_command='/usr/bin/env python /path/to/run_ecmwf_rapid.py')
```
## Step 12: Create CRON job to release lock on script
If the server is killed in the middle of a process, the lock with persist.
To prevent this, add a cron job to release the lock on bootup.
### Create Script
Create a script to reset the lock info file. Example path: /path/to/ecmwf_rapid_server_reset.py
Then, change the path to the lock info file. To do this, add *spt_compute_ecmwf_run_info_lock.txt*
to your *main_log_directory* from the *run_ecmwf_rapid.py* script.
```python
#! /usr/bin/env python
from spt_compute import reset_lock_info_file
if __name__ == "__main__":
LOCK_INFO_FILE = '/logs/spt_compute_ecmwf_run_info_lock.txt'
reset_lock_info_file(LOCK_INFO_FILE)
```
### Create Cron Job
```
$ crontab -e
```
Then add:
```
@reboot /usr/bin/env python /path/to/ecmwf_rapid_server_reset.py # RESET ECMWF RAPID PROCESS LOCK
```
# Troubleshooting
If you see this error:
ImportError: No module named packages.urllib3.poolmanager
```
$ pip install pip --upgrade
```
Restart your terminal
```
$ pip install requests --upgrade
```