https://github.com/hypertidy/controlledburn
rasterize but don't materialize
https://github.com/hypertidy/controlledburn
compression lazy polygons raster
Last synced: 8 months ago
JSON representation
rasterize but don't materialize
- Host: GitHub
- URL: https://github.com/hypertidy/controlledburn
- Owner: hypertidy
- License: other
- Created: 2022-07-08T02:02:09.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-23T19:47:03.000Z (about 3 years ago)
- Last Synced: 2023-02-27T07:03:13.129Z (almost 3 years ago)
- Topics: compression, lazy, polygons, raster
- Language: C++
- Homepage: https://hypertidy.github.io/controlledburn
- Size: 1.03 MB
- Stars: 6
- Watchers: 2
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# controlledburn
[](https://github.com/hypertidy/controlledburn/actions/workflows/R-CMD-check.yaml)
The goal of controlledburn is to rasterize without materializing any pixel values.
A very fast rasterization algorithm for polygons in [fasterize](https://github.com/ecohealthalliance/fasterize) does the following:
- restructures polygons to edges
- considers only non-horizontal polygon edges and indexes them in y,x order (slope always down)
- determines all affected raster rows and scans these for edge *start* and *end* column
- burns polygon value or identity into every pixel between those starts and ends
controlledburn avoids that last step. With these rasterization functions the return value is not a set of pixels or a side-effect of
a materialized raster file but simply those row and column start and end indexes.
This is an expression of the *cell abstraction* wish item [fasterize/issues/11](https://github.com/ecohealthalliance/fasterize/issues/11).
## TODO
- [ ] move to cpp11
- [ ] rasterize lines [fasterize/issues/30](https://github.com/ecohealthalliance/fasterize/issues/30)
- [ ] formats for import (wk, geos, grd, rct, triangles etc.)
- [ ] streaming with wkb/xy unpack with wk
- [ ] provide output options (see next section)
- [ ] port back into fasterize, with options for efficiently writing out to a tiled and sparse GeoTIFF
- [x] points is too easy, see vaster::cell_from_xy
- [x] name the package
- [x] copy logic from fasterize, and remove Armadillo array handling
- [x] remove use of raster objects, in favour of input extent and dimension
- [x] remove all trace of the raster package
- [x] implement return of the 'yline, xpix' and polygon ID info to user (see below)
- [x] make return of ylin,xpix structure efficient (CollectorList.h ftw)
Lines is working but still has some problems. For polygons it's pretty good, see [fasterize #6](https://github.com/ecohealthalliance/fasterize/issues/6) for a remaining issue.
## Outputs
Internal function `burn_polygon()` has arguments `sf`, `extent`, `dimension`. The first is a sf polygons data frame, extent is `c(xmin, xmax, ymin, ymax)` and dimension is `c(ncol, nrow)`. In fasterize you need an actual non-materialized *RasterLayer* object, but all that was
really used for was the six numbers extent, dimension and as the shell for the in-memory output.
The output of `burn_polygon()` is a list of four-element indexes `start,end,row,poly_id` - these are zero-based atm because they reflect the underlying C++ code. Examples shown here flatten this to a 4-column matrix (and add 1).
These options are still in play for what the interface/s could do:
* record presence of polygon (this is all we have currently) OR ID of polygon
* tools to format this meaningfully, and plot lazily (see example for quick plot)
* tools to materialize as actual raster data
I also found this real world example for which this is relevant, discussed in PROJ for
very fast lookup for large non-materialized (highly compressed) grids by Thomas
Knudsen:
https://github.com/OSGeo/PROJ/issues/1461#issuecomment-491501992
## Installation
You can install the development version of controlledburn like so:
``` r
remotes::install_github("hypertidy/controlledburn")
```
## Example
This is a basic example, this is fast, and shows that it works.
```{r example}
pols <- silicate::inlandwaters
library(vaster)
## define a raster (xmin, xmax, ymin, ymax), (ncol, nrow)
ext <- unlist(lapply(silicate::sc_vertex(pols), range))
dm <- c(50, 40)
r <- controlledburn:::burn_polygon(pols, extent = ext,
dimension = dm)
## our index is triplets of start,end,line where the polygon edge was detected -
## this essentially an rle by scanline of start,end polygon coverage
index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally
## plot just the start and ends of each scanline detected
xy0 <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, index[,c(3, 3)], index[,1:2]))
plot(xy0)
plot(silicate::PATH0(pols), add = TRUE)
## expand out to every cell
cr <- do.call(rbind, apply(index, 1,\(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = 19, cex = .3, col = palr::d_pal(cr[,3]))
plot(silicate::PATH0(pols), add = TRUE)
```
It scales to very large tasks, with small output.
```{r fast}
dm <- c(500000, 400000)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
dimension = dm))
length(r)
## consider a prod(dm) raster of type double (or even bool) obviously would compress again but why churn
pryr::object_size(r)
```
The following is inefficient, but shows that we get the right result.
```{r bad}
dm <- c(500, 400)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
dimension = dm))
index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally
## now go inefficient, this is every column,row index, then converted to cell, converted to xy
cr <- do.call(rbind, apply(index, 1, \(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = ".", col = palr::d_pal(cr[,3]))
rr <- terra::rast(cbind(xy, 0), type = "xyz")
rr[terra::cellFromXY(rr, xy)] <- 1
terra::plot(rr ,col = "firebrick", asp = NA)
plot(silicate::SC0(pols), add = TRUE)
```
## Code of Conduct
Please note that the controlledburn project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.