The automatic creation and update of tree inventories provides an opportunity to better manage trees as a precious resource in both natural and urban environments. Automatically detecting and classifying individual trees by species remains one of the most challenging problems in this area. The goal of this project is to automatically detect 3D bounding boxes for individual trees from airborne LiDAR data using state-of-the-art single-stage 3D object detection methods. According to Guo et al. (2019), 3D object detection methods fall into five broad categories: 1) multi-view methods, 2) segmentation-based methods, 3) frustum-based methods, 4) point cloud-based methods, and 5) bird's-eye-view (BEV)-based methods. The deep convolutional neural networks (DCNNs) we are considering fall into the point cloud-based category, which takes the point cloud directly as input. Our study area is the York University Keele Campus, Toronto, Ontario, Canada. We acquired three sets of data for the project: 1) airborne LiDAR, 2) mobile LiDAR, and 3) field data. Both LiDAR data sets were acquired in September 2018. The ground (mobile) data were collected on 8 September 2018 by a mobile mapping system consisting of a Lynx HS600 (Teledyne Optech, Toronto, Ontario) and a Ladybug 360° camera with six integrated cameras. The airborne mission took place on 23 September 2018 with two airborne laser scanning (ALS) sensors, Galaxy-PRIME and Titan (Teledyne Optech, Toronto, Ontario). The average flying height for the mission was 1829 m (6000 ft) above ground level. For the field data, 5717 trees on campus were recorded with species name, tree crown width [m], and location obtained by a handheld GPS.
Training data are generated semi-automatically with a marker-controlled watershed segmentation method (Markan, 2018) that first over-generates local maxima (treetops) and watersheds (tree crown boundaries). By merging the representative watershed candidates, we produced 5717 tree crown boundaries. One advantage of this method over a simple marker-controlled watershed segmentation is that it allows delineated tree crowns to overlap, which is common even in urban environments. Ground truth bounding boxes are then generated from each 2D watershed and the height of the corresponding tree. While current 3D object detection methods focus on detecting vehicles and pedestrians for autonomous driving applications, we aim to test the applicability of existing methods to tree objects. The state-of-the-art methods we are considering include VoxelNet (Zhou and Tuzel, 2018), SECOND (Yan et al., 2018), PointPillars (Lang et al., 2018), and Part-A^2 (Shi et al., 2019). In the 3D object detection modules, tree point cloud data are sampled and processed into representations suitable for deep convolutional neural networks. Several representations, such as 3D voxelization-based and bird’s-eye-view-based representations, are being explored. The representations will be passed through a trainable neural network to generate predicted 3D bounding boxes for trees. Improving detection accuracy will not only affect classification results but also improve existing tree locations, which have traditionally been recorded by handheld GPS or estimated from Google Street View imagery.
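As an illustration only, the two steps described above (deriving a ground truth box from a 2D watershed plus a tree height, and building a voxelization-based representation) can be sketched as follows. This is a minimal sketch in plain Python, assuming axis-aligned boxes, a crown given as a list of (x, y) boundary vertices, and known ground and treetop elevations. The names `Box3D`, `crown_to_box`, and `voxelize` are hypothetical and are not taken from the cited methods or from the project's own code.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]
VoxelIndex = Tuple[int, ...]


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D box: centre (cx, cy, cz) and size (dx, dy, dz) in metres."""
    cx: float
    cy: float
    cz: float
    dx: float
    dy: float
    dz: float


def crown_to_box(crown_xy: Sequence[Tuple[float, float]],
                 ground_z: float, top_z: float) -> Box3D:
    """Build a box from a 2D crown boundary and the tree's vertical extent.

    Assumption: the box is axis-aligned, so its footprint is simply the
    x/y extent of the crown boundary vertices.
    """
    if not crown_xy:
        raise ValueError("crown boundary has no vertices")
    if top_z <= ground_z:
        raise ValueError("treetop must be above the ground")
    xs = [x for x, _ in crown_xy]
    ys = [y for _, y in crown_xy]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return Box3D(
        cx=(x_min + x_max) / 2.0,
        cy=(y_min + y_max) / 2.0,
        cz=(ground_z + top_z) / 2.0,
        dx=x_max - x_min,
        dy=y_max - y_min,
        dz=top_z - ground_z,
    )


def voxelize(points: Sequence[Point3D], voxel_size: float,
             origin: Point3D = (0.0, 0.0, 0.0)) -> Dict[VoxelIndex, List[Point3D]]:
    """Group points by the index of the cubic voxel that contains them."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    voxels: Dict[VoxelIndex, List[Point3D]] = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis, relative to the grid origin.
        key = tuple(math.floor((c - o) / voxel_size) for c, o in zip(p, origin))
        voxels[key].append(p)
    return dict(voxels)


# Example with made-up coordinates: a 4 m x 2 m crown on a 12 m tall tree.
example_box = crown_to_box([(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)],
                           ground_z=100.0, top_z=112.0)
example_grid = voxelize([(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.1, 0.1)],
                        voxel_size=0.5)
```

Because each box is computed independently from its own crown boundary, boxes for neighbouring trees may overlap, which is consistent with the overlapping crowns that the segmentation step allows.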