Research Work

Ko, C., T. Tran and G. Sohn. 2020. Single 3D Tree Detection from LiDAR Point Cloud Using Deep Learning Network

The automatic creation and updating of tree inventories provides an opportunity to better manage trees as a precious resource in both natural and urban environments. Automatically detecting and classifying individual tree species remains one of the most challenging problems to be resolved. The goal of this project is to automatically detect 3D bounding boxes for individual trees from airborne LiDAR data using state-of-the-art single-stage 3D object detection methods. According to Guo et al. (2019), 3D object detection methods fall into five broad categories: 1) multi-view methods, 2) segmentation-based methods, 3) frustum-based methods, 4) point-cloud-based methods, and 5) BEV (bird's-eye-view)-based methods. The Deep Convolutional Neural Network (DCNN) we are choosing falls into the point-cloud-based category, using the point cloud directly as input. Our study area is the York University Keele Campus, Toronto, Ontario, Canada. We acquired three sets of data for the project: 1) airborne LiDAR, 2) mobile LiDAR, and 3) field data. The two LiDAR datasets were collected during September 2018. The ground data collection was carried out on 8th September 2018 by a mobile mapping system consisting of a Lynx HS600 (Teledyne Optech, Toronto, Ontario) and a Ladybug 360° camera with six integrated cameras. The airborne mission took place on 23rd September 2018 with two ALS sensors, Galaxy-PRIME and Titan (Teledyne Optech, Toronto, Ontario). The average flying height for the mission was 1829 m (6000 ft) AGL. For the field data, 5717 trees on campus were recorded with species name, tree crown width [m], and locations obtained by a handheld GPS.

Training data are generated semi-automatically with a marker-controlled watershed segmentation method (Parkan, 2018) that first over-generates candidate local maxima (treetops) and watersheds (tree crown boundaries). By merging the representative watershed candidates, we produced 5717 tree crown boundaries. One advantage of this method over a simple marker-controlled watershed segmentation is that it allows delineated tree crowns to overlap, which is common even in the urban environment. Ground-truth bounding boxes are then generated from each 2D watershed and the height of the individual tree. While current 3D object detection methods focus on detecting vehicles and pedestrians for autonomous driving applications, we aim to test the applicability of existing methods on tree objects. Current state-of-the-art methods we are considering include VoxelNet (Zhou and Tuzel, 2018), SECOND (Yan et al., 2018), PointPillars (Lang et al., 2018), and Part-A^2 (Shi et al., 2019). For the 3D object detection modules, tree point cloud data are sampled and processed into representations suitable for deep convolutional neural networks. Several representations, such as 3D-voxelization-based and bird's-eye-view-based representations, are being explored. The representations are passed through a trainable neural network to generate predicted 3D bounding boxes for trees. Improving detection accuracy will not only affect classification results but will also improve existing tree locations that were traditionally recorded by handheld GPS or estimated from Google Street View imagery.
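The marker-controlled watershed pipeline begins by detecting local maxima (treetop markers) on a rasterized canopy height model (CHM). A minimal sketch of that marker-detection step is below; the grid values, window radius, and height threshold are illustrative, not the parameters used in the project:

```python
import numpy as np

def detect_treetops(chm, radius=1, min_height=2.0):
    """Find treetop candidates as strict local maxima in a CHM grid.

    A cell qualifies if it is the unique maximum of its
    (2*radius+1)^2 neighbourhood and taller than min_height.
    """
    rows, cols = chm.shape
    tops = []
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            window = chm[r - radius:r + radius + 1, c - radius:c + radius + 1]
            centre = chm[r, c]
            # strict maximum: the centre value occurs exactly once in the window
            if centre >= min_height and centre == window.max() and (window == centre).sum() == 1:
                tops.append((r, c))
    return tops

# Toy 5x5 CHM with a single crown apex at grid cell (2, 2)
chm = np.array([
    [0, 0, 0, 0, 0],
    [0, 3, 4, 3, 0],
    [0, 4, 9, 4, 0],
    [0, 3, 4, 3, 0],
    [0, 0, 0, 0, 0],
], dtype=float)
print(detect_treetops(chm))  # [(2, 2)]
```

Each detected marker would then seed one watershed; merging overlapping watershed candidates yields the final crown boundaries described above.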

Keywords— Object detection, deep learning, lidar, tree, 3D bounding box


Guo, Y., Wang, H., Hu, Q., Liu, H., Liu, L., and Bennamoun, M. 2019. "Deep learning for 3D point clouds: A survey." arXiv preprint arXiv:1912.12033.

Lang, A.H., Vora, S., Caesar, H., Zhou, L., Yang, J., and Beijbom, O. 2018. “PointPillars: Fast encoders for object detection from point clouds,” arXiv preprint arXiv:1812.05784.

Parkan, M. 2018. "Digital Forestry Toolbox for Matlab/Octave." DOI: 10.5281/zenodo.1213013.

Shi, S., Wang, Z., Wang, X., and Li, H. 2019. "Part-A^2 Net: 3D part-aware and aggregation neural network for object detection from point cloud." arXiv preprint arXiv:1907.03670.

Yan, Y., Mao, Y., and Li, B. 2018. "SECOND: Sparsely embedded convolutional detection." Sensors, 18(10), p. 3337.

Zhou, Y., and Tuzel, O. 2018. "VoxelNet: End-to-end learning for point cloud based 3D object detection." In CVPR, pp. 4490–4499.

Ko, C., Kang, J. and G. Sohn. 2018. Deep Multi-task learning for tree genera classification. ISPRS Technical Commission II Symposium 2018 “Towards Photogrammetry 2020”, June 3-7, Riva del Garda, Italy.

The goal of our paper is to classify tree genera using airborne Light Detection and Ranging (LiDAR) data with a Convolutional Neural Network (CNN) – Multi-task Network (MTN) implementation. Unlike a Single-task Network (STN), where only one task is assigned to the learning outcome, an MTN is a deep learning architecture for learning a main task (classification of tree genera) together with other tasks (in our study, classification of coniferous versus deciduous) simultaneously, with shared classification features. The main contribution of this paper is improving classification accuracy from CNN-STN to CNN-MTN. This is achieved by introducing a concurrence loss to the designed MTN; this term regulates the overall network performance by minimizing the inconsistencies between the two tasks. Results show that we can increase the classification accuracy from 88.7% to 91.0% (from STN to MTN). The second goal of this paper is to address the problem of small training sample size by multiple-view data generation. The motivation is one of the most common problems in implementing deep learning architectures: an insufficient amount of training data. We address this problem by simulating the training dataset with a multiple-view approach. The promising results from this paper provide a basis for classifying larger datasets and a greater number of classes in the future.
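The idea of a concurrence loss can be sketched as follows. The paper's exact formulation is not reproduced here; as an illustrative stand-in, one can penalize the cross-entropy between the coniferous/deciduous distribution *implied* by the genus head and the distribution predicted by the coarse head. The genus-to-coarse mapping and the logit values are assumptions for the example:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical mapping from genus class to the coarse label:
# genera 0 = pine (coniferous); 1 = poplar, 2 = maple (deciduous)
GENUS_TO_COARSE = np.array([[1.0, 0.0],
                            [0.0, 1.0],
                            [0.0, 1.0]])

def concurrence_loss(genus_logits, coarse_logits):
    """Penalise disagreement between the two task heads: cross-entropy
    between the coarse distribution implied by the genus head and the
    coarse head's own prediction (averaged over the batch)."""
    p_genus = softmax(genus_logits)
    p_coarse = softmax(coarse_logits)
    implied = p_genus @ GENUS_TO_COARSE  # coarse distribution implied by genus head
    return float(-(implied * np.log(p_coarse + 1e-12)).sum(axis=-1).mean())

# Agreeing heads (pine + coniferous) incur a lower penalty than
# disagreeing heads (pine + deciduous)
agree = concurrence_loss(np.array([[5.0, 0.0, 0.0]]), np.array([[5.0, 0.0]]))
disagree = concurrence_loss(np.array([[5.0, 0.0, 0.0]]), np.array([[0.0, 5.0]]))
print(agree < disagree)  # True
```

In training, such a term would be added to the weighted sum of the two tasks' classification losses, steering the shared features toward mutually consistent predictions.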

Figure: Example of a pine LiDAR tree projected onto the x–z plane. (a) Result when the origin represents the x–y centroid of the tree and the lowest recorded LiDAR point is zero; grey value represents the location of the LiDAR point out of the page (+) or into the page (−). (b) The same tree on the same axes, where grey scale represents height above the lowest recorded LiDAR point. (c) and (d) are the images produced from (a) and (b), respectively. (e) is the result of combining (c) and (d) into a 2-channel image, where (c) represents the red channel and (d) the green channel.
Figure: Summary of the MTN network.

Ko, C., Sohn, G., Remmel, T.K. and Miller, J., 2016. Maximizing the diversity of ensemble Random Forests for tree genera classification using high density LiDAR data. Remote Sensing 8(8):646.

Recent research into improving the effectiveness of forest inventory management using airborne LiDAR data has focused on developing advanced theories in data analytics. Furthermore, supervised learning as a predictive model for classifying tree genera (and species, where possible) has been gaining popularity as a means of minimizing this labor-intensive task. However, bottlenecks remain that hinder the immediate adoption of supervised learning methods. With supervised classification, training samples are required for learning the parameters that govern the performance of a classifier, yet the selection of training data is often subjective and the quality of such samples is critically important. For LiDAR scanning in forest environments, the quantification of data quality is somewhat abstract, normally referring to some metric related to the completeness of individual tree crowns; however, this issue has not received much attention in the literature. Intuitively, the choice of training samples of varying quality will affect classification accuracy. In this paper, a Diversity Index (DI) is proposed that characterizes the diversity of data quality (Qi) among the selected training samples required for constructing a classification model of tree genera. The training sample is diversified in terms of data quality as opposed to the number of samples per class. The diversified training sample allows the classifier to better learn the positive and negative instances and therefore achieves higher classification accuracy in discriminating the "unknown" class samples from the "known" samples. Our algorithm is implemented with Random Forests base classifiers using six geometric features derived from LiDAR data. The training sample contains three tree genera (pine, poplar, and maple) and the validation samples contain four labels (pine, poplar, maple, and "unknown"). Classification accuracy improved from 72.8%, when training samples were selected randomly (with stratified sample size), to 93.8%, when samples were selected with additional criteria, and from 88.4% to 93.8% when an ensemble method was used.
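To make the notion of quality diversity concrete, the paper's DI is not reproduced here, but a generic stand-in is easy to sketch: Shannon entropy over binned per-sample quality values Qi, which is zero when all training samples share one quality level and maximal when quality is spread evenly:

```python
import math
from collections import Counter

def diversity_index(qualities, bins=4):
    """Illustrative diversity measure (NOT the paper's DI): Shannon
    entropy of sample-quality values Qi in [0, 1], bucketed into
    equal-width bins. Higher values mean more varied sample quality."""
    counts = Counter(min(int(q * bins), bins - 1) for q in qualities)
    n = len(qualities)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

uniform_quality = [0.90, 0.91, 0.92, 0.90]  # only complete, high-quality crowns
mixed_quality = [0.20, 0.45, 0.70, 0.95]    # one sample per quality level
print(diversity_index(uniform_quality) < diversity_index(mixed_quality))  # True
```

Under any such measure, a training set drawn only from pristine crowns scores low, matching the paper's argument that quality-diverse samples generalize better to "unknown" cases.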

Ko, C., Sohn, G., Remmel, T.K. and Miller, J., 2014. Hybrid ensemble classification of tree genera using airborne LiDAR data. Remote Sensing, 6:11225–11243.

This paper presents a hybrid ensemble method comprising a sequential and a parallel architecture for the classification of tree genera using LiDAR (Light Detection and Ranging) data. The two classifiers use different sets of features: (1) features derived from geometric information, and (2) features derived from vertical profiles, using Random Forests as the base classifier. This classification result is also compared with those obtained by replacing the base classifier with LDA (Linear Discriminant Analysis), kNN (k Nearest Neighbor), and SVM (Support Vector Machine). The uniqueness of this research is in the development, implementation, and application of three main ideas: (1) the hybrid ensemble method, which aims to improve classification accuracy, (2) a pseudo-margin criterion for assessing the quality of predictions, and (3) an automatic feature reduction method using results drawn from Random Forests. An additional point-density analysis is performed to study the influence of decreased point density on classification accuracy. Using Random Forests as the base classifier, the average classification accuracies for the geometric classifier and the vertical profile classifier are 88.0% and 88.8%, respectively, with improvement to 91.2% using the ensemble method. The training genera include pine, poplar, and maple within a study area located north of Thessalon, Ontario, Canada.
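The parallel part of such a hybrid ensemble can be sketched as a weighted combination of the two classifiers' class-probability outputs. This is an illustrative simplification under assumed equal weights, not the paper's exact fusion rule:

```python
def ensemble_predict(p_geometric, p_profile, w=0.5):
    """Fuse the class probabilities of the geometric classifier and the
    vertical-profile classifier with a weighted average, then return
    the highest-scoring genus (weights here are illustrative)."""
    combined = {c: w * p_geometric[c] + (1 - w) * p_profile[c]
                for c in p_geometric}
    return max(combined, key=combined.get)

# The two classifiers disagree; the fused score follows the more
# confident vertical-profile prediction
p_geo = {"pine": 0.50, "poplar": 0.30, "maple": 0.20}
p_prof = {"pine": 0.10, "poplar": 0.80, "maple": 0.10}
print(ensemble_predict(p_geo, p_prof))  # poplar
```

A pseudo-margin criterion, as described above, would then compare the top two combined scores to flag low-confidence predictions.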

Ko, C., Sohn, G. and T.K. Remmel, 2013. A Spatial Analysis of Geometric Features Derived from High-Density Airborne LiDAR Data for Tree Species Classification. Canadian Journal of Remote Sensing, 39(s1):S73–S85, DOI: 10.5589/m13-024.

Categorical recognition of a tree’s genus is known to be valuable information for the effective management of forest inventories. In this paper, we present a method for learning a discriminative model using Random Forests to classify individual trees into three genera: pine, poplar, and maple. We believe that both internal and external geometric characteristics of the tree crown are related to tree form and are therefore useful in classifying trees to the genus level. Our approach involves the extraction of both internal and external geometric features from a LiDAR point cloud, as we believe that geometric features provide important information about the organization of points inside the tree crown along with overall tree shape and form. We developed 24 geometric features and then reduced the number of features to increase efficiency. These geometric characteristics, computed for 160 sampled trees from eight field sites, were classified using Random Forests, achieving an 88.3% average accuracy using 25% (40 trees) of the data for training.
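Simple examples of such geometric crown descriptors can be computed directly from an (N, 3) point array. The three features below (height, mean crown diameter, and a height-to-width elongation ratio) are illustrative stand-ins, not any of the paper's 24 features:

```python
import numpy as np

def geometric_features(points):
    """Compute a few illustrative crown descriptors from an (N, 3)
    LiDAR point array with columns x, y, z."""
    z = points[:, 2]
    xy = points[:, :2]
    radial = np.linalg.norm(xy - xy.mean(axis=0), axis=1)  # horizontal spread
    height = float(z.max() - z.min())
    return {
        "height": height,                               # total tree height
        "crown_width": float(2.0 * radial.mean()),      # mean crown diameter
        "elongation": height / (2.0 * radial.max() + 1e-9),  # tall vs. wide
    }

# A toy "tree": a tip, a base, and four mid-crown points
pts = np.array([[0, 0, 0], [1, 0, 5], [-1, 0, 5],
                [0, 1, 5], [0, -1, 5], [0, 0, 10]], dtype=float)
feats = geometric_features(pts)
print(feats["height"])  # 10.0
```

A feature vector of this kind, stacked per tree, is what a Random Forests classifier would consume as its input matrix.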

Ko, C., Remmel, T.K. and Sohn, G., 2012. Mapping tree genera using discrete LiDAR and geometric tree metrics, BOSQUE, 33(3):313–319

Maps of tree genera are useful in applications including forest inventory, urban planning, and the maintenance of utility transmission line infrastructure. We present a case study of using high-density airborne LiDAR data for tree genera mapping along the right-of-way (ROW) of a utility transmission line corridor. Our goal was to identify single trees that posed potential threats to transmission line infrastructure. Using the three-dimensional mapping capability of LiDAR, we derived tree metrics related to the geometry of the trees (tree forms). For example, the dominant growth direction of trees is useful in identifying trees leaning towards transmission lines. We also derived other geometric indices useful in determining tree genera; these metrics included height, crown shape, size, and branching structure. Our pilot study was situated north of Thessalon, Ontario, Canada, along a major utility corridor ROW and surrounding woodlots. The geometric features used for genera classification could be categorized into five broad categories related to: 1) lines, 2) clusters, 3) volumes, 4) 3D buffers of points, and 5) overall tree shape, which provided input parameters for the Random Forest classifier.
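One way to estimate a dominant growth direction of the kind described above is as the first principal component of a tree's point cloud; the angle between that axis and the vertical then serves as a lean metric. This is a sketch of the general technique, not necessarily the paper's exact derivation:

```python
import numpy as np

def dominant_growth_direction(points):
    """Estimate the dominant growth axis of an (N, 3) tree point cloud
    as the principal eigenvector of its covariance, and return the
    axis plus its lean from vertical in degrees."""
    centred = points - points.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(centred.T))  # eigenvalues ascending
    axis = vecs[:, -1]                           # largest-eigenvalue axis
    axis = axis if axis[2] >= 0 else -axis       # orient upward
    lean = float(np.degrees(np.arccos(np.clip(axis[2], -1.0, 1.0))))
    return axis, lean

# A nearly vertical "tree": noisy points strung along the z-axis
vertical = np.array([[0, 0, z] for z in np.linspace(0, 10, 50)], dtype=float)
vertical += np.random.default_rng(0).normal(scale=0.05, size=vertical.shape)
_, lean = dominant_growth_direction(vertical)
print(lean < 5.0)  # True: almost no lean
```

A tree whose lean angle points its growth axis toward the conductor corridor would be flagged as a potential threat under this kind of metric.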