https://github.com/landscapegeoinformatics/grqa_src
Scripts used during the creation of the Global River Water Quality Archive (GRQA)
hydrology python water-quality
- Host: GitHub
- URL: https://github.com/landscapegeoinformatics/grqa_src
- Owner: LandscapeGeoinformatics
- License: MIT
- Created: 2021-02-03T09:46:07.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2022-09-07T07:06:50.000Z (about 3 years ago)
- Last Synced: 2024-06-05T19:27:52.621Z (over 1 year ago)
- Topics: hydrology, python, water-quality
- Language: Python
- Homepage:
- Size: 85 KB
- Stars: 4
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# GRQA_src
[DOI: 10.5281/zenodo.5082147](https://doi.org/10.5281/zenodo.5082147)
Scripts used during the creation of the Global River Water Quality Archive (GRQA).
The dataset can be downloaded at
The data description paper is available at
The scripts are divided into two folders. Folder **preprocessing** contains scripts used for preprocessing raw source data into a common structure used for GRQA. Folder **grqa_processing** contains scripts used for processing the merged data, generating plots and statistics.
**preprocessing** contains the following scripts:
* *\*\_download* used for downloading source data
* *\*\_units* for collecting water quality parameter units when multiple units per parameter were present in source data
* *\*\_preprocessing* for source data cleaning and parameter harmonization to convert into a common structure used in GRQA
* *WQP\_merge\_stats* for merging WQP time series statistics

**grqa\_processing** contains the following scripts:
* *\*\_param\_codes* for creating a list of GRQA parameters used as an input for the parallel implementation of *\*\_obs\_merging*
* *\*\_obs\_merging* used for merging harmonized source data, calculating time series statistics per site (outliers, monthly availability, continuity) and flagging potential duplicate observations
* *\*\_param\_stats* for calculating GRQA time series statistics per parameter
* *\*\_plot\_sites* for creating maps of observation site distribution, monthly availability, monthly continuity and median value per parameter
* *\*\_plot\_hist* for creating temporal distribution plots, histograms and box plots per parameter
* *\*\_plot\_sites\_grid* for creating maps of observation site distribution, monthly availability, monthly continuity and median value of DO, DOC, TP and TSS for the paper
* *\*\_plot\_hist\_grid* for creating temporal distribution plots, histograms and box plots of DO, DOC, TP and TSS for the paper

Each Python script has a corresponding shell script that was used for submitting Slurm jobs to the HPC cluster of the University of Tartu.
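
To give a rough idea of the kind of per-site bookkeeping described for *\*\_obs\_merging*, here is a minimal standard-library sketch. It is not taken from the repository: the record layout, the function names `flag_duplicates` and `monthly_availability`, and both definitions are assumptions made for illustration. A duplicate is assumed to be a row with identical site, date, parameter and value, and monthly availability is assumed to be the number of distinct calendar months with at least one observation. The actual scripts may define these differently.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical observation records: (site_id, obs_date, param_code, value).
# This layout is an assumption for illustration, not the GRQA schema.
observations = [
    ("S1", date(2010, 1, 15), "DO", 8.2),
    ("S1", date(2010, 1, 15), "DO", 8.2),  # identical to the row above
    ("S1", date(2010, 3, 2), "DO", 7.9),
    ("S1", date(2011, 3, 9), "DO", 8.0),
    ("S2", date(2010, 6, 20), "DO", 6.5),
]


def flag_duplicates(rows):
    """Return one boolean per row: True if the identical row occurs more than once."""
    counts = Counter(rows)
    return [counts[row] > 1 for row in rows]


def monthly_availability(rows):
    """Map each site to the number of distinct calendar months (1-12) with data."""
    months = defaultdict(set)
    for site_id, obs_date, _param, _value in rows:
        months[site_id].add(obs_date.month)
    return {site: len(seen) for site, seen in months.items()}


duplicate_flags = flag_duplicates(observations)
availability = monthly_availability(observations)
```

In this sketch duplicates are only flagged, not dropped, which mirrors the README's wording about "flagging potential duplicate observations" and leaves the decision to remove them to a later step.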