{"id":16704122,"url":"https://github.com/rabutler-usbr/knnstdisagg","last_synced_at":"2025-10-11T01:39:13.959Z","repository":{"id":193109205,"uuid":"142495524","full_name":"rabutler-usbr/knnstdisagg","owner":"rabutler-usbr","description":"Nonparametric space-time streamflow disaggregation using knn","archived":false,"fork":false,"pushed_at":"2023-09-07T23:31:34.000Z","size":15241,"stargazers_count":1,"open_issues_count":7,"forks_count":2,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-07-18T20:07:00.391Z","etag":null,"topics":["disaggregation","hydrology","knn","r"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rabutler-usbr.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-07-26T21:25:07.000Z","updated_at":"2024-09-25T06:51:10.000Z","dependencies_parsed_at":"2023-09-24T10:14:58.233Z","dependency_job_id":null,"html_url":"https://github.com/rabutler-usbr/knnstdisagg","commit_stats":{"total_commits":131,"total_committers":1,"mean_commits":131.0,"dds":0.0,"last_synced_commit":"b631d7e7b85a1eccaa122e175a16bb8fb1475612"},"previous_names":["rabutler-usbr/knnstdisagg"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/rabutler-usbr/knnstdisagg","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rabutler-usbr%2Fknnstdisagg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rabutler-usbr%2Fknnstdisagg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rabutler-usbr%2Fknnstdisagg/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/rabutler-usbr%2Fknnstdisagg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rabutler-usbr","download_url":"https://codeload.github.com/rabutler-usbr/knnstdisagg/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rabutler-usbr%2Fknnstdisagg/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279005929,"owners_count":26083986,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["disaggregation","hydrology","knn","r"],"created_at":"2024-10-12T19:11:12.649Z","updated_at":"2025-10-11T01:39:13.938Z","avatar_url":"https://github.com/rabutler-usbr.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# knnstdisagg\n\nAn R package to perform space and time disaggregation of streamflow using a K-nearest neighbor (knn) approach. 
\n\n[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) [![R build status](https://github.com/rabutler-usbr/knnstdisagg/workflows/R-CMD-check/badge.svg)](https://github.com/rabutler-usbr/knnstdisagg/actions) [![Codecov test coverage](https://codecov.io/gh/rabutler-usbr/knnstdisagg/branch/master/graph/badge.svg)](https://codecov.io/gh/rabutler-usbr/knnstdisagg?branch=master)\n\n## Installation \n\nThe package is currently available only on GitHub.\n\n```{r, eval = FALSE}\n# install.packages(\"remotes\")\nremotes::install_github(\"rabutler-usbr/knnstdisagg\")\n```\n\n## Example Usage\n\nOne application of the KNN space-time disaggregation methodology is to take paleo reconstructed data at Lees Ferry in the Colorado River Basin and disaggregate those annual flows at one location to monthly flows at 29 locations. The following steps show how to use the knnstdisagg package to do so. \n\nThe space-time disaggregation method relies on monthly pattern data; this example takes those data from the [CoRiverNF](https://github.com/BoulderCodeHub/CoRiverNF) package. \n\n```{r, eval = FALSE}\nremotes::install_github(\"BoulderCodeHub/CoRiverNF\")\n```\n\n### Setup the data\n\nThe space-time method disaggregates an annual value (`ann_flow`) to monthly data by matching `ann_flow` to an annual index value (`ann_index_flow`). Then, the monthly pattern and spatial pattern (`mon_flow`) from the selected annual index year are used to disaggregate the data. \n\nIn this example, we will be disaggregating the Meko et al. 
(2007) paleo reconstructed data (`meko`), which is provided as an example dataset in this package:\n\n```{r}\nlibrary(knnstdisagg)\nlibrary(CoRiverNF)\n\nhead(meko)\n```\n\n`meko` is already formatted correctly for use in this package: one column of years and one column of annual data.\n\nThe annual index flow and monthly values are from the CoRiverNF package. We will match the `meko` data to the historical water year data at Lees Ferry. Additionally, for now, the index data needs to be a two-column matrix rather than an xts object. The monthly data must cover full water years, so we remove the last three months of data so that it stops at the end of the last water year.\n\n```{r}\n# set up annual data\nannual_index \u003c- CoRiverNF::wyAnnTot$LeesFerry\nyrs \u003c- as.numeric(format(index(wyAnnTot$LeesFerry), \"%Y\"))\nannual_index \u003c- as.matrix(annual_index)\nannual_index \u003c- cbind(yrs, annual_index)\n\n# set up monthly data\nlast_month \u003c- paste0(\"/\", max(yrs), \"-09\")\nmonthly_data \u003c- CoRiverNF::monthlyInt[last_month]\n```\n\n*Note, this example uses a named xts object for the monthly data, which means the results are also a named xts object. An unnamed matrix will also work.*\n\n### Space-time disaggregation\n\nThe space-time disaggregation is performed by `knn_space_time_disagg()`. We have already set up the data necessary for `ann_flow`, `ann_index_flow`, and `mon_flow`. Because this is water year data, we will set the `start_month` to 10, as the water year starts in October. We will only disaggregate the data once, so there is only one \"simulation\". Previous work found that the Upper Basin nodes should be scaled based on the volume at Lees Ferry, while the Lower Basin values are not scaled, i.e., the monthly data are taken directly from the selected index year. The Upper Basin sites are sites 1-20. Finally, we will use the default weighting scheme from Nowak et al. to select the nearest neighbor. 
\n\n```{r}\ndisagg \u003c- knn_space_time_disagg(\n  ann_flow = meko,\n  ann_index_flow = annual_index,\n  mon_flow = monthly_data,\n  start_month = 10,\n  nsim = 1,\n  scale_sites = 1:20,\n  k_weights = knn_params_default(nrow(annual_index))\n)\n```\n\nThe results are now in `disagg`, and we can get the output using `knnst_get_disagg_data()`:\n\n```{r}\nhead(knnst_get_disagg_data(disagg)[,5:10]) # only look at a few sites\n```\n\nIf needed, the output can be saved to disk using `write_knnst()`. This saves the disaggregated data for every simulation as well as the selected index years. \n\n### QA/QC\n\n#### Base Statistics\n\nThe knnstdisagg package also includes plotting functionality to assist with QA/QC. Plots of monthly statistics (mean, max, min, variance, lag-1 correlation, and skew), annual statistics (same as monthly), annual cdf, and a monthly cdf for each month can be created using `plot()`. Each call to `plot()` works for one site. A `bin_size` must be specified; this is the moving window across which all statistics on the disaggregated data are computed. Looking at Glenwood Springs monthly statistics:\n\n```{r}\np \u003c- plot(\n  disagg, \n  site = \"GlenwoodSprings\", \n  base_units = \"acre-feet\", \n  which = 14, \n  show = TRUE,\n  bin_size = 50\n)\n```\n\n```{r, echo=FALSE}\np[[\"monthly-stats\"]]\n```\n\n*If an unnamed matrix was used for input, then the sites are accessed by `\"S1\"`, `\"S2\"`, etc. 
during plotting.*\n\nWe can also look at the annual cdf for Bluff, or the May cdf for Maybell:\n\n```{r}\np \u003c- plot(\n  disagg, \n  site = \"Bluff\", \n  base_units = \"acre-feet\", \n  which = 13, \n  show = TRUE,\n  bin_size = 50\n)\n```\n\n```{r, echo=FALSE}\np[[\"annual-cdf\"]]\n```\n\n```{r}\np \u003c- plot(\n  disagg, \n  site = \"Maybell\", \n  base_units = \"acre-feet\", \n  which = 5, \n  show = TRUE,\n  bin_size = 50\n)\n```\n\n```{r, echo=FALSE}\np[[\"May-cdf\"]]\n```\n\nAll plots for a given site can be created at once using `which = 1:15`, and the suite of plots can be saved using `save_knnstplot()`. For example, all plots for the data at Cameo could be saved to a pdf using: \n\n```{r, eval=FALSE}\np \u003c- plot(\n  disagg, \n  site = \"Cameo\", \n  base_units = \"acre-feet\", \n  which = 1:15, \n  show = FALSE,\n  bin_size = 50\n)\nsave_knnstplot(p, \"Cameo.pdf\", width = 8, height = 6)\n```\n\n#### Spatial Correlation\n\nAnother statistic to check is the spatial correlation between sites. This is computed using `knnst_spatial_cor()`. For each specified site, the correlation with all other sites is computed, and then it can be easily plotted. To get the correlation from Cameo and Hoover to all other sites:\n\n```{r}\nsp_cor \u003c- knnst_spatial_cor(disagg, sites = c(\"Cameo\", \"Hoover\"), bin_size = 50)\nplot(sp_cor)\n```\n\n#### Temporal Correlation\n\nA final statistic to compare is the monthly cross correlation. This is computed using `knnst_temporal_cor()` and is only computed for one site at a time. To compute the monthly cross correlation at Greendale:\n\n```{r}\ntmp_cor \u003c- knnst_temporal_cor(disagg, site = \"Greendale\", bin_size = 50)\nplot(tmp_cor)\n```\n\n### Recreating Disaggregation\n\n**Will be added to vignette**\n\n## Acknowledgements\n\nThis package implements the methods developed by Nowak et al. (2010): \n\nNowak, K., J. Prairie, B. Rajagopalan, and U. 
Lall (2010), A nonparametric stochastic approach for multisite disaggregation of annual to daily streamflow, Water Resour. Res., 46, W08529, [doi:10.1029/2009WR008530](https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2009WR008530).\n\nIt also uses the Meko et al. (2007) paleo reconstructed Lees Ferry natural flow as example data in the package and above:\n\nMeko, D.M., Woodhouse, C.A., Baisan, C.A., Knight, T., Lukas, J.J., Hughes, M.K., and Salzer, M.W. 2007. Medieval Drought in the Upper Colorado River Basin. Geophysical Research Letters 34, L10705.\n\n## Disclaimer\n\nThis software is in the public domain because it contains materials that originally came from the U.S. Bureau of Reclamation, an agency of the United States Department of Interior.\n\nAlthough this code has been used by Reclamation, no warranty, expressed or implied, is made by Reclamation or the U.S. Government as to the accuracy and functioning of the program and related program material nor shall the fact of distribution constitute any such warranty, and no responsibility is assumed by Reclamation in connection therewith.\n\nThis software is provided \"AS IS.\"\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frabutler-usbr%2Fknnstdisagg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frabutler-usbr%2Fknnstdisagg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frabutler-usbr%2Fknnstdisagg/lists"}