https://github.com/jsta/nhdr

R interface to the National Hydrography Dataset :droplet:

---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
```

[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/nhdR)](https://cran.r-project.org/package=nhdR)
[![R-CMD-check](https://github.com/jsta/nhdR/actions/workflows/R-CMD-check.yml/badge.svg)](https://github.com/jsta/nhdR/actions/workflows/R-CMD-check.yml)
[![DOI](https://zenodo.org/badge/75339263.svg)](https://zenodo.org/badge/latestdoi/75339263)

# nhdR

Tools for querying, downloading, and networking both the [National Hydrography Dataset (NHD)](https://www.usgs.gov/national-hydrography) and [NHDPlus](https://www.epa.gov/waterdata/nhdplus-national-hydrography-dataset-plus) datasets.

## Installation

CRAN policy prohibits packages from writing to a persistent location by default. As a result, `nhdR` writes all data to a temporary location unless `temporary = FALSE` is passed to the `nhd_plus_get` or `nhd_get` functions. Alternatively, `nhdR` automatically writes data to a persistent location if the `nhdR_path` environment variable is set. To set it, add the following line to your `.Rprofile`:
```r
Sys.setenv(nhdR_path = file.path(rappdirs::user_data_dir(appname = "nhdR",
                                                         appauthor = "nhdR")))
```

Your `.Rprofile` file can be edited using the `usethis::edit_r_profile()` function.
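
To confirm that the setting took effect in a new session, the environment variable can be inspected directly. The sketch below uses only base R. `resolve_data_dir()` is a hypothetical helper written for illustration; it is not part of `nhdR`, and the package's own lookup may differ.

```r
# Hypothetical helper (not part of nhdR): report where data would live
# under the convention described above, i.e. a persistent directory when
# `nhdR_path` is set and a temporary one otherwise.
resolve_data_dir <- function() {
  path <- Sys.getenv("nhdR_path", unset = "")
  if (nzchar(path)) path else tempdir()
}

resolve_data_dir()
```

If this returns a path under `tempdir()`, the variable is not set in the current session and downloads will not persist across restarts.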

### Stable version from CRAN

```{r, eval=FALSE}
install.packages("nhdR")
```

### Development version from GitHub

```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("jsta/nhdR")
```

This package also requires an installation of [7-zip](https://www.7-zip.org/) that can be called from the command line as `7z` or `7za.exe`. To check whether your machine is set up, run `nhdR:::has_7z()`.
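
As an independent check that does not rely on the package, one can look for the executable on the `PATH` with base R. This is only a sketch and is not necessarily what `has_7z()` does internally; the `7za` candidate name is an assumption added for non-Windows systems.

```r
# Look for a 7-zip executable on the PATH using base R only.
# "7z" and "7za.exe" come from the text above; "7za" is an assumed extra.
# Sys.which() returns "" for any command it cannot find.
candidates <- Sys.which(c("7z", "7za", "7za.exe"))
found <- candidates[nzchar(candidates)]

if (length(found) > 0) {
  message("7-zip found at: ", found[[1]])
} else {
  message("No 7-zip executable found on the PATH.")
}
```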

## Usage

### Load package

```{r message=FALSE, results='hide'}
library(nhdR)
```

### NHD Plus

NHDPlus exports are organized by vector processing unit (VPU). A low-resolution VPU map is shown below (also available as `nhdR::vpu_shp`). A high-resolution version is available from the [EPA NHDPlus data page](https://www.epa.gov/waterdata/nhdplus-global-data).

```{r echo=FALSE, message=FALSE, warning=FALSE}
library(ggplot2)
library(stringr)

dt <- nhdR::vpu_shp # [,"UnitID"]
dt <- dt[dt$UnitType == "VPU", ]

centroid_xy <- sf::st_as_text(sf::st_geometry(sf::st_centroid(dt[, "UnitID"])))
extract_coords <- function(messy_centroid) {
  res <- stringr::str_split(messy_centroid, "\\(", simplify = TRUE)[2]
  res <- stringr::str_split(res, "\\)", simplify = TRUE)[1]
  stringr::str_split(res, " ", simplify = TRUE)
}

coords <- data.frame(matrix(
  as.numeric(unlist(lapply(centroid_xy, extract_coords))),
  ncol = 2, byrow = TRUE), stringsAsFactors = FALSE)
names(coords) <- c("x", "y")
coords$x <- coords$x * -1
coords$UnitID <- dt[, "UnitID"]$UnitID

ggplot(dt) +
  geom_sf(aes(fill = UnitID), show.legend = FALSE) +
  xlim(126, 70) +
  ylim(23, 52) +
  geom_text(data = coords, aes(x = x, y = y, label = UnitID)) +
  theme_minimal() +
  theme(axis.title = element_blank()) +
  ggtitle("Vector Processing Units (VPU)")
```

```{r eval=FALSE}
# get a vpu export
nhd_plus_get(vpu = 4, "NHDSnapshot")
nhd_plus_get(vpu = 4, "NHDPlusAttributes")
nhd_plus_get(vpu = 4, "NHDPlusCatchment")
```

``` r
# list layers
nhd_plus_list(vpu = 4, "NHDSnapshot")
#> [1] "NHDArea.dbf" "NHDAreaEventFC.dbf"
#> [3] "NHDAreaEventFC.shp" "NHDArea.shp"
#> [5] "NHDFCode.dbf" "NHDFlowline.dbf"
#> [7] "NHDFlowline.shp" "NHDFlowline.shp.xml"
#> [9] "NHDLine.dbf" "NHDLineEventFC.dbf"
#> [11] "NHDLineEventFC.shp" "NHDLine.shp"
#> [13] "NHDPoint.dbf" "NHDPointEventFC.dbf"
#> [15] "NHDPointEventFC.shp" "NHDPoint.shp"
#> [17] "NHDReachCode_Comid.dbf" "NHDReachCrossReference.dbf"
#> [19] "NHDWaterbody.dbf" "NHDWaterbody.shp"
nhd_plus_list(vpu = 4, "NHDPlusAttributes")
#> [1] "CumulativeArea.dbf" "DivFracMP.dbf"
#> [3] "elevslope.dbf" "HeadwaterNodeArea.dbf"
#> [5] "MegaDiv.dbf" "PlusARPointEvent.dbf"
#> [7] "PlusFlowAR.dbf" "PlusFlow.dbf"
#> [9] "PlusFlowlineLakeMorphology.dbf" "PlusFlowlineVAA.dbf"
#> [11] "PlusWaterbodyLakeMorphology.dbf"
nhd_plus_list(vpu = 4, "NHDPlusCatchment")
#> [1] "Catchment.dbf" "Catchment.shp" "featureidgridcode.dbf"
```
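
The listings above are file names, whereas the later calls refer to layers without an extension (for example `"NHDWaterbody"`). A small base R sketch of deriving shapefile layer names from such a listing follows. The `files` vector is a hand-copied subset of the output above, not a live call to `nhd_plus_list()`.

```r
# A few file names copied from the NHDSnapshot listing above.
files <- c("NHDArea.dbf", "NHDArea.shp", "NHDFlowline.shp",
           "NHDFlowline.shp.xml", "NHDWaterbody.dbf", "NHDWaterbody.shp")

# Keep only names ending in ".shp" (this skips ".shp.xml" sidecar files),
# then drop the extension to get bare layer names.
shp_layers <- tools::file_path_sans_ext(grep("\\.shp$", files, value = TRUE))
shp_layers
```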

``` r
# get layer info
nhd_plus_info(vpu = 4, "NHDSnapshot", "NHDWaterbody")
#> [1] "Driver: ESRI Shapefile; number of rows: 31830 "
#> [2] "Feature type: wkbPolygon with 3 dimensions"
#> [3] "Extent: (-93.24332 40.43575) - (-73.61814 48.11344)"
#> [4] "CRS: +proj=longlat +datum=NAD83 +no_defs "
#> [5] "LDID: 87 "
#> [6] "Number of fields: 12 "
#> [7] " name type length typeName"
#> [8] "1 COMID 0 9 Integer"
#> [9] "2 FDATE 9 10 Date"
#> [10] "3 RESOLUTION 4 7 String"
#> [11] "4 GNIS_ID 4 10 String"
#> [12] "5 GNIS_NAME 4 65 String"
#> [13] "6 AREASQKM 2 19 Real"
#> [14] "7 ELEVATION 2 19 Real"
#> [15] "8 REACHCODE 4 14 String"
#> [16] "9 FTYPE 4 24 String"
#> [17] "10 FCODE 0 9 Integer"
#> [18] "11 SHAPE_LENG 2 19 Real"
#> [19] "12 SHAPE_AREA 2 19 Real"
```

```{r eval=FALSE, echo=FALSE}
info <- capture.output(nhd_plus_info(vpu = 4, "NHDSnapshot", "NHDWaterbody"))
# gsub("/home/jemma", "~", info)
info[2:length(info)]
```

``` r
# load layer
dt <- nhd_plus_load(vpu = 4, "NHDSnapshot", "NHDWaterbody")
#> Reading layer `NHDWaterbody' from data source
#> `/home/jemma/.local/share/nhdR/NHDPlus/GL_04_NHDSnapshot/NHDWaterbody.shp'
#> using driver `ESRI Shapefile'
#> Simple feature collection with 31830 features and 12 fields
#> Geometry type: POLYGON
#> Dimension: XYZ
#> Bounding box: xmin: -93.24332 ymin: 40.43575 xmax: -73.61814 ymax: 48.11344
#> z_range: zmin: 0 zmax: 0
#> Geodetic CRS: NAD83
```

### NHD

NHD exports are organized by US state.

```{r eval=FALSE}
nhd_get(state = c("DC", "HI"))
```

``` r
nhd_list(state = "DC")
#> [1] "ExternalCrosswalk" "NHDFCode"
#> [3] "NHDFeatureToMetadata" "NHDFlow"
#> [5] "NHDFlowlineVAA" "NHDMetadata"
#> [7] "NHDProcessingParameters" "NHDReachCodeMaintenance"
#> [9] "NHDReachCrossReference" "NHDSourceCitation"
#> [11] "NHDStatus" "NHDVerticalRelationship"
#> [13] "NHDPoint" "NHDFlowline"
#> [15] "NHDLine" "NHDArea"
#> [17] "NHDWaterbody" "NHDAreaEventFC"
#> [19] "NHDLineEventFC" "NHDPointEventFC"
#> [21] "WBDLine" "NonContributingDrainageArea"
#> [23] "NWISBoundary" "NWISDrainageArea"
#> [25] "WBDHU14" "WBDHU8"
#> [27] "WBDHU2" "WBDHU4"
#> [29] "WBDHU6" "WBDHU10"
#> [31] "WBDHU12" "WBDHU16"
#> [33] "HYDRO_NET_Junctions"
#> attr(,"driver")
#> [1] "OpenFileGDB"
#> attr(,"nlayers")
#> [1] 33
```
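
The `nhd_list()` result shown above is a character vector carrying `driver` and `nlayers` attributes, so ordinary base R tools can be used to query it. The sketch below works on a toy stand-in built by hand from a few of the names above; it does not call `nhd_list()` and assumes nothing about the result beyond what the printed output shows.

```r
# Toy stand-in for an nhd_list() result: a few layer names copied from the
# output above, with the same attribute names attached by hand.
layers <- structure(
  c("NHDFlowline", "NHDWaterbody", "WBDHU8", "WBDHU12"),
  driver = "OpenFileGDB",
  nlayers = 4L
)

# Attributes are read with attr().
attr(layers, "driver")

# Select the watershed boundary (WBDHU*) layers by name.
wbd_layers <- grep("^WBDHU", as.character(layers), value = TRUE)
wbd_layers
```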

``` r
nhd_info(state = "DC", dsn = "NHDWaterbody")
#> Source: "/home/jemma/.local/share/nhdR/NHD_H_District_of_Columbia_State_GDB.gdb", layer: "NHDWaterbody"
#> Driver: OpenFileGDB; number of rows: 8011
#> Feature type: wkbPolygon with 3 dimensions
#> Extent: (-78.07095 38.52142) - (-76.82219 39.64683)
#> CRS: +proj=longlat +datum=NAD83 +no_defs
#> Number of fields: 13
#> name type length typeName
#> 1 Permanent_Identifier 4 40 String
#> 2 FDate 11 0 DateTime
#> 3 Resolution 0 0 Integer
#> 4 GNIS_ID 4 10 String
#> 5 GNIS_Name 4 65 String
#> 6 AreaSqKm 2 0 Real
#> 7 Elevation 2 0 Real
#> 8 ReachCode 4 14 String
#> 9 FType 0 0 Integer
#> 10 FCode 0 0 Integer
#> 11 VisibilityFilter 0 0 Integer
#> 12 Shape_Length 2 0 Real
#> 13 Shape_Area 2 0 Real
```

``` r
head(nhd_load(state = "DC", dsn = "NHDWaterbody"))
#> Reading layer `NHDWaterbody' from data source
#> `/home/jemma/.local/share/nhdR/NHD_H_District_of_Columbia_State_GDB.gdb'
#> using driver `OpenFileGDB'
#> Simple feature collection with 8011 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XYZ
#> Bounding box: xmin: -78.07095 ymin: 38.52142 xmax: -76.82219 ymax: 39.64683
#> z_range: zmin: 0 zmax: 0
#> Geodetic CRS: NAD83
#> Reading query `SELECT * from NHDWaterbody LIMIT 1' from data source `/home/jemma/.local/share/nhdR/NHD_H_District_of_Columbia_State_GDB.gdb'
#> using driver `OpenFileGDB'
#> Simple feature collection with 1 feature and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XYZ
#> Bounding box: xmin: -76.99652 ymin: 38.68957 xmax: -76.99631 ymax: 38.6897
#> z_range: zmin: 0 zmax: 0
#> Geodetic CRS: NAD83
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: -77.5767 ymin: 38.68957 xmax: -76.99631 ymax: 39.5882
#> Geodetic CRS: WGS 84
#> Permanent_Identifier FDate Resolution GNIS_ID GNIS_Name
#> 1 46565431 2002-07-21 18:00:00 2
#> 2 51767181 2002-08-14 18:00:00 2
#> 3 51767223 2002-08-14 18:00:00 2
#> 4 51767287 2002-08-14 18:00:00 2
#> 5 51767709 2002-08-14 18:00:00 2
#> 6 51768273 2002-08-14 18:00:00 2
#> AreaSqKm Elevation ReachCode FType FCode VisibilityFilter Shape_Length
#> 1 0.000 NA 02070010004605 436 43624 0 0.0005402029
#> 2 0.002 NA 02070008004808 390 39004 50000 0.0017289109
#> 3 0.001 NA 02070008004829 390 39004 2000000 0.0013369633
#> 4 0.001 NA 02070008004860 390 39004 24000 0.0011083831
#> 5 0.002 NA 02070008005063 390 39004 50000 0.0016429957
#> 6 0.001 NA 02070008005335 390 39004 24000 0.0012442057
#> Shape_Area Shape
#> 1 1.879174e-08 POLYGON ((-76.99631 38.6896...
#> 2 1.954519e-07 POLYGON ((-77.56946 39.5881...
#> 3 1.239613e-07 POLYGON ((-77.56954 39.5567...
#> 4 8.130533e-08 POLYGON ((-77.57658 39.5250...
#> 5 1.745505e-07 POLYGON ((-77.46919 39.3298...
#> 6 8.126193e-08 POLYGON ((-77.2087 39.18799...
```