# Tree Segmentation
This is the code repository for the Tree Segmentation Project. The purpose of this repository is:

- To document detailed steps in processing LiDAR.
- To visualize results.
- To provide code to segment and create individual tree hulls.

![Image of lastrees output](./media/trees.png)

## Dependencies
* lidR        (2.14+)
* rlas        (1.3.4+)
* rgdal       (1.4-8+)
* tictoc      (1.0+)
* sp          (1.3-2+)
* concaveman  (1.0.0+)

## How to run
The R script to process point cloud data and output tree canopy polygons is called **lidar_processing_pipeline.R**.

There should be an input directory containing one or more las/laz files. 
e.g.

```R
# Note: list.files() interprets pattern as a regex, so "\\.las$" (not the glob "*.las")
files <- list.files(path = "/path/to/input_las_files", pattern = "\\.las$", full.names = TRUE, recursive = FALSE)
```

There should also be a path to a directory which will contain the shapefile(s) of tree canopies.

e.g.

```R
outws <- "/path/to/output_shps"
```

# Processing LiDAR point cloud data

This tutorial builds on the `lidR` tutorial [Segment individual trees and compute metrics](https://github.com/Jean-Romain/lidR/wiki/Segment-individual-trees-and-compute-metrics) by exploring in depth the process of preparing the raw point cloud for tree segmentation.

## Overview
* Downloading data
* Inspecting the point cloud data
* Filtering point cloud data
* Generating a canopy height model
* Individual tree detection
* Summary

### Downloading data
Let's start with a las tile from the City of Vancouver containing a nice mixture of buildings and trees. The City of Vancouver has a really nice web interface:

![City of Vancouver LiDAR Web Server](./media/lidar_server.png)

For this tutorial, we are going to download and work with [tile 4810E_54560N](https://webtransfer.vancouver.ca/opendata/2018LiDAR/4810E_54560N.zip).

Before we even unzip the downloaded file, let's inspect all of the available metadata to get a sense of how much we know about the data. Luckily, the web interface has a [nice metadata page](https://opendata.vancouver.ca/explore/dataset/lidar-2018/information/?location=12,49.2594,-123.14438). We can see a few important features from the metadata:

- The projected coordinate system is NAD 83 UTM Zone 10N
- Point density is 30 pts/m²
- Data was acquired on August 27th and 28th, 2018 (i.e. leaf-on conditions)
- Points were classified as follows:

      1. Unclassified;
      2. Bare-earth and low grass;
      3. Low vegetation (height <2m);
      4. High vegetation (height >2m);
      5. Water;
      6. Buildings;
      7. Other; and
      8. 
Noise (noise points, blunders, outliers, etc.)

### Inspecting the point cloud data
Now we will begin inspecting the raw point cloud data using the R package `lidR`.

Import the packages we will use in this tutorial:

```R
require(lidR)
require(rlas)   # Necessary for writelax
require(rgdal)  # Writing to shp or raster
require(tictoc) # For the tic() toc() functions
```

Let's read in the las file

```R
data <- "/path/to/your/pointclouddata.las"
las <- readLAS(data) # Read in all of the data
```

and inspect the data

```R
lascheck(las)
```

```R
 Checking the data
  - Checking coordinates... ✓
  - Checking coordinates type... ✓
  - Checking attributes type... ✓
  - Checking ReturnNumber validity... ✓
  - Checking NumberOfReturns validity... ✓
  - Checking ReturnNumber vs. NumberOfReturns... ✓
  - Checking RGB validity... ✓
  - Checking absence of NAs... ✓
  - Checking duplicated points...
   ⚠ 6337 points are duplicated and share XYZ coordinates with other points
  - Checking degenerated ground points... ✓
  - Checking attribute population...
   ⚠ 'ScanDirectionFlag' attribute is not populated.
 Checking the header
  - Checking header completeness... ✓
  - Checking scale factor validity... ✓
  - Checking Point Data Format ID validity... ✓
  - Checking extra bytes attributes validity... ✓
  - Checking coordinate reference sytem... ✓
 Checking header vs data adequacy
  - Checking attributes vs. point format... ✓
  - Checking header bbox vs. actual content... ✓
  - Checking header number of points vs. actual content... ✓
  - Checking header return number vs. actual content... ✓
 Checking preprocessing already done
  - Checking ground classification... yes
  - Checking normalization... no
  - Checking negative outliers...
   ⚠ 137970 points below 0
  - Checking flightline classification... 
yes
```

You can see that `lascheck()` provides useful quality-control information about the LiDAR data.

We can also get some basic information about the point cloud using

```R
summary(las)
```

```R
class        : LAS (LASF v1.2)
point format : 1
memory       : 3.7 Gb 
extent       : 481000, 482000, 5456000, 5457000 (xmin, xmax, ymin, ymax)
coord. ref.  : +proj=utm +zone=10 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
area         : 1 km²
points       : 47.36 million points
density      : 47.37 points/m²
names        : X Y Z gpstime Intensity ReturnNumber NumberOfReturns ScanDirectionFlag EdgeOfFlightline Classification Synthetic_flag Keypoint_flag Withheld_flag ScanAngleRank UserData PointSourceID 
File signature:           LASF 
File source ID:           0 
Global encoding:
 - GPS Time Type: Standard GPS Time 
 - Synthetic Return Numbers: no 
 - Well Know Text: CRS is GeoTIFF 
 - Aggregate Model: false 
Project ID - GUID:        00000000-0000-0000-0000-000000000000 
Version:                  1.2
System identifier:        LAStools (c) by rapidlasso GmbH 
Generating software:      las2las (version 181119) 
File creation d/y:        7/2019
header size:              227 
Offset to point data:     323 
Num. var. length record:  1 
Point data format:        1 
Point data record length: 28 
Num. of point records:    47360009 
Num. 
of points by return: 33908912 9530523 3165826 660943 85597 
Scale factor X Y Z:       0.01 0.01 0.01 
Offset X Y Z:             4e+05 5e+06 0 
min X Y Z:                481000 5456000 -397.71 
max X Y Z:                482000 5457000 308.65 
Variable length records: 
   Variable length record 1 of 1 
       Description: by LAStools of rapidlasso GmbH 
       Tags:
          Key 1024 value 1 
          Key 3072 value 3157 
          Key 3076 value 9001 
          Key 4099 value 9001 
```

Of particular interest are the projected coordinate system and the point density.

Now let's inspect the classes:

```R
sort(unique(las@data$Classification))
```

`[1] 1 2 3 5 6 7 9`

From this, we can see which classes are missing from this las tile. An inspection of the City of Vancouver LiDAR classification and the ASPRS classification specifications shows that the classes are not aligned:

#### The City classes:

![City Classes](./media/ubc_classes.png)

#### The ASPRS class specifications

![ASPRS Classes](./media/asprs_classes.png)

This is unusual, so let's take a look at the classified point cloud data to see what is going on.

```R
plot(las, color = "Classification")
```

![Classified Point Cloud](./media/las_classes.png)

We can select individual classes to inspect them more closely

```R
las_class <- lasfilter(las, Classification == 5)
plot(las_class)
```
![Class 5 Trees](./media/class5_trees.png)

After inspecting all of the classes, it appears as if the LiDAR tiles are in fact classified to ASPRS classification standards. 
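Alongside the visual inspection, tabulating the number of points per class can confirm which classes dominate the tile (a minimal sketch using base R's `table()`; `las` is the full tile read earlier):

```r
# Count points per classification code; names are the class IDs, values the counts
table(las@data$Classification)
```

This is a quick sanity check against the class codes returned by `sort(unique(...))` above.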
However, when observing class 5 (High Vegetation), it became apparent that there were several outliers we will need to remove.

![Outliers](./media/outliers.png)

### Filtering point cloud data
Here we are going to filter out all of the classes except for our classes of interest

```R
las <- readLAS(data, filter = "-keep_class 2 5") # Keep the ground and high vegetation classes
```

Then, normalize the data so that ground points are centered on 0.

```R
dtm <- grid_terrain(las, algorithm = knnidw(k = 8, p = 2))
las_normalized <- lasnormalize(las, dtm)
```

There is an excellent example of using a filter to remove points above the 95th percentile of height in the [`lidR` documentation](https://cran.r-project.org/web/packages/lidR/vignettes/lidR-catalog-apply-examples.html). This is how we implement the filter:

```R
# Create a filter to remove points above the 95th percentile of height
lasfilternoise <- function(las, sensitivity)
{
  p95 <- grid_metrics(las, ~quantile(Z, probs = 0.95), 10)
  las <- lasmergespatial(las, p95, "p95")
  las <- lasfilter(las, Z < p95 * sensitivity)
  las$p95 <- NULL
  return(las)
}

las_denoised <- lasfilternoise(las_normalized, sensitivity = 1.2)
```

You can see the filter does a good job removing most outliers

#### Before filtering
![Noisy Las](./media/las_noisy.png)

#### After filtering
![Denoised Las](./media/las_denoised.png)

### Generating a canopy height model
Now that we have the classes isolated and the outliers filtered, we can generate a canopy height model (CHM), which will be the basis for segmenting and classifying our trees. It is important to note that we are primarily interested in surface characteristics of the tree canopy. 
Therefore, it is not necessary to ensure that the points are uniformly distributed, as would be the case in analyses where the vertical point distribution is important, such as grid metrics.

There have been several good tutorials on generating perfect canopy height models, including [this one](https://github.com/Jean-Romain/lidR/wiki/Rasterizing-perfect-canopy-height-models) from the authors of lidR and [this one](https://rapidlasso.com/2014/11/04/rasterizing-perfect-canopy-height-models-from-lidar/) from Martin Isenburg.

We are going to use a pit-free CHM generated in `lidR`.

```R
chm <- grid_canopy(las_denoised, 0.5, pitfree(c(0, 2, 5, 10, 15), c(3, 1.5), subcircle = 0.2))
```

`lidR` provides a nice way to visualize raster elevation data in 3D.

```R
plot_dtm3d(chm)
```

![3D CHM](./media/chm-gif.gif)

Our objective is to generate polygons of individual tree canopies (hulls). It is often helpful to apply a 3x3 or 5x5 median filter to smooth the canopy height model prior to tree detection. A median filter helps define the boundary of the tree canopy and can lead to better results when delineating individual trees, especially in areas with a closed canopy.

Here a single 5x5 moving window is used to apply a median filter:

```R
ker <- matrix(1, 5, 5)
chm_s <- focal(chm, w = ker, fun = median) # focal() comes from the raster package
```

### Individual tree detection
We are going to use a watershed algorithm for the tree detection, with a height threshold of 4m

```R
algo <- watershed(chm_s, th = 4)
las_watershed <- lastrees(las_denoised, algo)

# Remove points that are not assigned to a tree
trees <- lasfilter(las_watershed, !is.na(treeID))

# View the results
plot(trees, color = "treeID", colorPalette = pastel.colors(100))
```
![las trees](./media/las-trees-gif.gif)

Great! 
An initial inspection of the tree segmentation shows positive results--time to delineate tree canopies.

```R
hulls <- tree_hulls(trees, type = "concave", concavity = 2, func = .stdmetrics)
```

The individual tree canopy polygons (hulls) look great.

![tree hulls](./media/tree-hulls.png)

An added bonus is that we also summarized point cloud metrics within each polygon by including `func = .stdmetrics` in the `tree_hulls` call. This allows us to do many things, such as quickly apply statistical filters, classify trees using machine learning approaches, and visualize individual tree attributes.

For example, the following map shows the maximum height (zmax) within each tree hull.

![tree hulls zmax](./media/trees-zmax.png)

### Summary

Every LiDAR-based project will be different--point cloud data may range from a CSV file of XYZ coordinates to fully preprocessed and classified las data. However, the fundamentals of how we approach the project remain constant. Before any analysis is conducted, it is necessary to thoroughly identify abnormalities and errors in the LiDAR dataset and to understand how these will affect the analysis. For example:

* Is there any metadata?
* What, if any, coordinate system is used?
* What is the point density?
* Have the points been classified and, if so, is the classification accurate and usable?
* Will the flight line overlap impact my analysis?
* Do the points need to be thinned, regularized, or filtered for analyses such as calculating grid metrics?
* Will outliers need to be filtered?
* Is the point cloud normalized?

Tree segmentation and delineation accuracy will vary based on forest cover type. Generally speaking, tree segmentation in open conifer forests will yield higher accuracy than in mixed, closed-canopy forest types. [Hastings et al. 
(2020)](https://www.mdpi.com/619942) provides a useful accuracy assessment of individual tree crown delineation (ITCD) algorithms in temperate forests.
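For reference, the processing steps walked through above can be strung together into a single script (a sketch using the lidR 2.x function names from this tutorial; the input path is a placeholder and `raster` is assumed to be attached for `focal()`):

```r
library(lidR)
library(raster) # focal() for the median filter

# 1. Read only the ground (2) and high vegetation (5) classes
las <- readLAS("/path/to/your/pointclouddata.las", filter = "-keep_class 2 5")

# 2. Normalize heights against an IDW terrain model
dtm <- grid_terrain(las, algorithm = knnidw(k = 8, p = 2))
las_normalized <- lasnormalize(las, dtm)

# 3. Drop outliers above the local 95th height percentile
lasfilternoise <- function(las, sensitivity) {
  p95 <- grid_metrics(las, ~quantile(Z, probs = 0.95), 10)
  las <- lasmergespatial(las, p95, "p95")
  las <- lasfilter(las, Z < p95 * sensitivity)
  las$p95 <- NULL
  las
}
las_denoised <- lasfilternoise(las_normalized, sensitivity = 1.2)

# 4. Pit-free CHM, smoothed with a 5x5 median filter
chm   <- grid_canopy(las_denoised, 0.5, pitfree(c(0, 2, 5, 10, 15), c(3, 1.5), subcircle = 0.2))
chm_s <- focal(chm, w = matrix(1, 5, 5), fun = median)

# 5. Watershed segmentation, then concave canopy hulls with point metrics
algo  <- watershed(chm_s, th = 4)
trees <- lasfilter(lastrees(las_denoised, algo), !is.na(treeID))
hulls <- tree_hulls(trees, type = "concave", concavity = 2, func = .stdmetrics)
```

The resulting `hulls` object can then be written to the output directory, e.g. with `rgdal::writeOGR`.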