Forest Inspection Dataset for Aerial Semantic Segmentation and Depth Estimation
This work addresses the shortage of annotated data for deep learning in forest monitoring. The contribution is incremental: it builds on existing methods and adds a new dataset.
The authors address the lack of densely annotated forest imagery for deforestation assessment by introducing a new large aerial dataset of real and virtual recordings with semantic segmentation labels and depth maps. They find that training on diverse scenarios yields the best results; specific performance metrics are not provided.
UAVs are used to monitor changes in forest environments because they are lightweight and provide a large variety of surveillance data. However, the data they collect does not, on its own, provide enough detail to understand the scene, which is needed to assess the degree of deforestation. Deep learning algorithms must be trained on large amounts of data to produce accurate interpretations, but ground-truth recordings of annotated forest imagery are not available. To solve this problem, we introduce a new large aerial dataset for forest inspection that contains both real-world and virtual recordings of natural environments, with densely annotated semantic segmentation labels and depth maps, captured under different illumination conditions and at various altitudes and recording angles. We test the performance of two multi-scale neural networks on the semantic segmentation task (HRNet and the PointFlow network), studying the impact of the various acquisition conditions and the capabilities of transfer learning from virtual to real data. Our results show that the best performance is obtained when training on a dataset containing a large variety of scenarios, rather than on data separated into specific categories. We also develop a framework to assess the degree of deforestation of an area.
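As an illustration only, one simple way to turn a semantic segmentation mask into a deforestation score is to measure the fraction of labeled pixels that do not belong to a forest class. The sketch below is not the framework developed in this work, whose details are not given here; the function name `deforestation_degree`, the class ids in `FOREST_CLASS_IDS`, and the `IGNORE_ID` value are all assumptions made for the example.

```python
import numpy as np

# Hypothetical label ids: the dataset's actual class list is not stated here.
FOREST_CLASS_IDS = (1,)
IGNORE_ID = 255  # assumed id for unlabeled pixels


def deforestation_degree(mask, forest_ids=FOREST_CLASS_IDS, ignore_id=IGNORE_ID):
    """Return the fraction of valid pixels that are not labeled as forest."""
    mask = np.asarray(mask)
    valid = mask != ignore_id            # drop unlabeled pixels
    n_valid = int(valid.sum())
    if n_valid == 0:
        raise ValueError("mask contains no valid pixels")
    forest = np.isin(mask, forest_ids) & valid
    return 1.0 - forest.sum() / n_valid


# Toy 2x4 mask: four forest pixels, three non-forest pixels, one unlabeled.
example = np.array([[1, 1, 0, 0],
                    [1, 1, 0, 255]])
print(deforestation_degree(example))
```

A pixel-ratio score like this depends directly on the quality of the predicted mask, so errors in the segmentation step carry over into the deforestation estimate.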