Training point-based deep learning networks for forest segmentation with synthetic data
This work addresses the cost and difficulty of collecting data for remote-sensing forestry applications, though its contribution is incremental: it applies existing methods to a new domain using synthetic data.
The authors tackle the scarcity of real point cloud data for forest segmentation by developing a realistic simulator that generates synthetic forest scenes. They find that synthetic data is a feasible basis for training deep learning networks to classify real forest point clouds, and they release both the simulator and the datasets.
Remote sensing with unmanned aerial systems (UAS) has seen increasing use in forestry in recent years, along with machine learning for data processing. Deep learning architectures, extensively applied in natural language and image processing, have recently been extended to the point cloud domain. However, the availability of point cloud datasets for training and testing remains limited. Creating point cloud datasets of forested environments is expensive, requires high-precision sensors, and is time-consuming because the points must be classified manually. Moreover, forest areas can be inaccessible or dangerous for humans, further complicating data collection. This raises the question of whether synthetic data can be used to train deep learning networks without relying on large volumes of real forest data. To answer this question, we developed a realistic simulator that procedurally generates synthetic forest scenes. Using this simulator, we conducted a comparative study of different state-of-the-art point-based deep learning networks for forest segmentation. Using the generated datasets, we assessed the feasibility of training deep learning networks on synthetic data to classify point clouds from real forest datasets. Both the simulator and the datasets are released as part of this work.
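To make the idea of procedurally generated, automatically labeled forest point clouds concrete, the following is a minimal sketch in plain Python. It is not the simulator released with this work: the function `generate_scene`, the three class labels, the geometric primitives (a sinusoidal ground surface, cylindrical trunks, spherical crowns), and all numeric ranges are assumptions chosen purely for illustration. It shows only why synthetic data removes the manual labeling step: every point receives its class at the moment it is generated.

```python
import math
import random

# Hypothetical class labels; the label set used by the actual simulator
# is not specified in the text above.
TERRAIN, TRUNK, CANOPY = 0, 1, 2


def generate_scene(n_trees=5, size=20.0, seed=0):
    """Return a list of (x, y, z, label) points for a toy forest plot."""
    rng = random.Random(seed)  # seeded so scenes are reproducible
    points = []

    def ground(x, y):
        # Gently undulating terrain height (an illustrative assumption).
        return 0.3 * math.sin(0.3 * x) * math.cos(0.3 * y)

    # Terrain: points scattered uniformly over the plot.
    for _ in range(2000):
        x, y = rng.uniform(0, size), rng.uniform(0, size)
        points.append((x, y, ground(x, y), TERRAIN))

    for _ in range(n_trees):
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        base = ground(cx, cy)
        height = rng.uniform(5.0, 12.0)
        radius = rng.uniform(0.1, 0.3)

        # Trunk: points on the surface of a vertical cylinder.
        for _ in range(200):
            angle = rng.uniform(0, 2 * math.pi)
            z = base + rng.uniform(0, height)
            points.append((cx + radius * math.cos(angle),
                           cy + radius * math.sin(angle), z, TRUNK))

        # Canopy: points inside a sphere at the top of the trunk,
        # drawn by rejection sampling from the enclosing cube.
        crown = rng.uniform(1.5, 3.0)
        for _ in range(400):
            while True:
                dx, dy, dz = (rng.uniform(-1, 1) for _ in range(3))
                if dx * dx + dy * dy + dz * dz <= 1.0:
                    break
            points.append((cx + crown * dx, cy + crown * dy,
                           base + height + crown * dz, CANOPY))

    return points
```

Calling `generate_scene(n_trees=3, seed=1)` yields a labeled point list that could be written out in whatever format a point-based network expects. A realistic simulator would additionally need to model sensor behavior, occlusion, and vegetation structure, none of which this sketch attempts.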