https://github.com/mdsumner/ngdal
'GDAL' Multi-dimensional Array Model
- Host: GitHub
- URL: https://github.com/mdsumner/ngdal
- Owner: mdsumner
- Created: 2025-02-25T02:40:54.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-25T02:41:00.000Z (over 1 year ago)
- Last Synced: 2025-08-18T14:55:44.302Z (10 months ago)
- Language: R
- Size: 8.79 KB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- Code of conduct: CODE_OF_CONDUCT.md
README
---
output: github_document
editor_options:
  chunk_output_type: console
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# ngdal
The goal of ngdal is to explore a multidimensional array model based on the [tidync package](https://docs.ropensci.org/tidync/).
Very unstable: we're relying on a dev branch of the gdalraster package at [mdsumner/gdalraster](https://github.com/mdsumner/gdalraster/).
Probably won't keep this name.
The advantage over tidync is that GDAL can already open single-endpoint datacubes directly, so we don't have to go through the virtualization schemes that xarray and related tools use to append NetCDF files. We can also:
- explore GDAL's own virtualization in multidimensional VRT
- use GDAL for a very wide range of sources and access schemes (Earthdata, Google Cloud, NetCDF, HDF5, GRIB, and Zarr)
There's no modelling of groups yet.
```r
library(gdalraster)
library(ngdal)
## the bucket is public, so tell GDAL not to sign Google Cloud Storage requests
Sys.setenv(GS_NO_SIGN_REQUEST="YES")
m <- ngdal:::model(ngdal:::source("/vsigs/gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3"))
m@variables
```
```
# A tibble: 277 × 7
   group    id name                            ndims natts dim_coord active
 1 /         0 latitude                            1     1 NA        NA
 2 /         0 level                               1     1 NA        NA
 3 /         0 longitude                           1     1 NA        NA
 4 /         0 time                                1     1 NA        NA
 5 /         0 100m_u_component_of_wind            3     2 NA        NA
 6 /         0 100m_v_component_of_wind            3     2 NA        NA
 7 /         0 10m_u_component_of_neutral_wind     3     2 NA        NA
 8 /         0 10m_u_component_of_wind             3     2 NA        NA
 9 /         0 10m_v_component_of_neutral_wind     3     2 NA        NA
10 /         0 10m_v_component_of_wind             3     2 NA        NA
# ℹ 267 more rows
# ℹ Use `print(n = ...)` to see more rows
```
```r
## the thing NetCDF doesn't define, but tidync does: each unique 'grid' is a group of
## same-shape variables (I think xarray calls these Datasets within a tree)
dplyr::distinct(m@grids, grid, rank)
```
```
# A tibble: 6 × 2
  grid    rank
1 1          1
2 2          1
3 3          1
4 4          1
5 4,1,3      3
6 4,2,1,3    4
```r
m@dimensions ## sets of dimindex define a "grid"
str(m@coordinates)
```
```
# A tibble: 4 × 3
  name         size dimindex
1 latitude      721        1
2 level          37        2
3 longitude    1440        3
4 time      1323648        4
List of 4
$ latitude : tibble [721 × 1] (S3: tbl_df/tbl/data.frame)
..$ value: num [1:721] 90 89.8 89.5 89.2 89 ...
$ level : tibble [37 × 1] (S3: tbl_df/tbl/data.frame)
..$ value: num [1:37] 1 2 3 5 7 10 20 30 50 70 ...
$ longitude: tibble [1,440 × 1] (S3: tbl_df/tbl/data.frame)
..$ value: num [1:1440] 0 0.25 0.5 0.75 1 1.25 1.5 1.75 2 2.25 ...
$ time : tibble [1,323,648 × 1] (S3: tbl_df/tbl/data.frame)
..$ value: num [1:1323648] 0 1 2 3 4 5 6 7 8 9 ...
```
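Because each coordinate is held as a plain vector of values, locating a cell comes down to matching a query against those vectors. The sketch below assumes regular quarter-degree axes matching the sizes and leading values printed above, rebuilt here with `seq()` rather than read from the source. The `nearest_index()` helper is hypothetical and not part of ngdal.

```r
# Assumed stand-ins for m@coordinates$latitude$value and
# m@coordinates$longitude$value (regular 0.25 degree spacing).
latitude  <- seq(90, -90, by = -0.25)     # 721 values, north to south
longitude <- seq(0, 359.75, by = 0.25)    # 1440 values, 0..360 convention

# Illustrative helper: 1-based index of the coordinate value closest to `value`
nearest_index <- function(coord, value) {
  which.min(abs(coord - value))
}

# A point near Hobart, Tasmania: roughly 147.33 E, 42.88 S
i_lat <- nearest_index(latitude, -42.88)
i_lon <- nearest_index(longitude, 147.33)

# The matched cell's coordinates and its position along each dimension
print(c(latitude = latitude[i_lat], longitude = longitude[i_lon]))
print(c(i_lat = i_lat, i_lon = i_lon))
```

Note that the latitude axis runs from north to south, so a larger index means further south.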
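Try a file on NASA Earthdata (this assumes the GDAL configuration option `GDAL_HTTP_HEADER_FILE` is already set, pointing at a file that holds the authorization header):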
```r
dsn <- "/vsicurl/https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20250223090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
ngdal:::model(ngdal:::source(dsn))
```
```
@ source :
.. @ description: chr "/vsicurl/https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/2025022309"| __truncated__
.. @ family : chr(0)
@ rawinfo : chr "{ \"type\": \"group\", \"driver\": \"netCDF\", \"name\": \"/\", \"attributes\": { \"Conventions\": \"CF-"| __truncated__
@ variables : tibble [9 × 7] (S3: tbl_df/tbl/data.frame)
$ group : chr [1:9] "/" "/" "/" "/" ...
$ id : num [1:9] 0 0 0 0 0 0 0 0 0
$ name : chr [1:9] "time" "lat" "lon" "analysed_sst" ...
$ ndims : int [1:9] 1 1 1 3 3 3 3 3 3
$ natts : int [1:9] 4 6 6 7 5 8 7 6 5
$ dim_coord: logi [1:9] NA NA NA NA NA NA ...
$ active : logi [1:9] NA NA NA NA NA NA ...
@ coordmeta : tibble [3 × 7] (S3: tbl_df/tbl/data.frame)
$ group : chr [1:3] "/" "/" "/"
$ id : num [1:3] 0 0 0
$ name : chr [1:3] "time" "lat" "lon"
$ ndims : int [1:3] 1 1 1
$ natts : int [1:3] 4 6 6
$ dim_coord: logi [1:3] NA NA NA
$ active : logi [1:3] NA NA NA
@ axes : tibble [21 × 3] (S3: tbl_df/tbl/data.frame)
$ variable : chr [1:21] "time" "lat" "lon" "analysed_sst" ...
$ dimension: Named chr [1:21] "/time" "/lat" "/lon" "/time" ...
..- attr(*, "names")= chr [1:21] "time" "lat" "lon" "analysed_sst1" ...
$ dimindex : int [1:21] 1 2 3 1 2 3 1 2 3 1 ...
@ grids : tibble [6 × 3] (S3: tbl_df/tbl/data.frame)
$ dimension: Named chr [1:6] "/time" "/lat" "/lon" "/time" ...
..- attr(*, "names")= chr [1:6] "time" "lat" "lon" "analysed_sst1" ...
$ grid : chr [1:6] "1" "2" "3" "1,2,3" ...
$ rank : int [1:6] 1 1 1 3 3 3
@ coordinates:List of 3
.. $ time: tibble [1 × 1] (S3: tbl_df/tbl/data.frame)
.. ..$ value: num 1.39e+09
.. $ lat : tibble [17,999 × 1] (S3: tbl_df/tbl/data.frame)
.. ..$ value: num [1:17999] -90 -90 -90 -90 -89.9 ...
.. $ lon : tibble [36,000 × 1] (S3: tbl_df/tbl/data.frame)
.. ..$ value: num [1:36000] -180 -180 -180 -180 -180 ...
@ dimensions : tibble [3 × 3] (S3: tbl_df/tbl/data.frame)
$ name : chr [1:3] "time" "lat" "lon"
$ size : int [1:3] 1 17999 36000
$ dimindex: int [1:3] 1 2 3
```
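The Earthdata example relies on `GDAL_HTTP_HEADER_FILE`, a GDAL configuration option that names a text file of HTTP headers to send with each request. Here is a minimal sketch of one way to set it up, assuming a bearer token is available in an environment variable called `EARTHDATA_TOKEN`. That variable name is an assumption for illustration, not something ngdal or GDAL defines. The sketch also assumes the option is picked up from the environment, so it should be run before the first request is made.

```r
# Read the token from the environment; the variable name is hypothetical
# and the fallback string is only a placeholder.
token <- Sys.getenv("EARTHDATA_TOKEN", unset = "<your-token-here>")

# Write a one-line header file in "Key: value" form
header_file <- tempfile(fileext = ".txt")
writeLines(sprintf("Authorization: Bearer %s", token), header_file)

# Restrict permissions where the platform supports it, since the file
# holds a credential
Sys.chmod(header_file, mode = "0600")

# Point GDAL at the header file via an environment variable
Sys.setenv(GDAL_HTTP_HEADER_FILE = header_file)
```

A file in the session's temporary directory disappears when R exits, which keeps the token from lingering on disk.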
## Code of Conduct
Please note that the ngdal project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.