MESSI: A Multi-Elevation Semantic Segmentation Image Dataset of an Urban Environment
The paper provides a new benchmark dataset for semantic segmentation in drone-based urban applications. Its contribution is incremental, since it focuses on data collection rather than novel methods.
The paper introduces the MESSI dataset, comprising 2525 drone-captured images from various altitudes in dense urban environments, to investigate depth effects on semantic segmentation, and demonstrates its use by training neural networks for segmentation tasks.
This paper presents a Multi-Elevation Semantic Segmentation Image (MESSI) dataset comprising 2525 images taken by a drone flying over dense urban environments. MESSI is unique in two main features. First, it contains images from various altitudes, allowing us to investigate the effect of depth on semantic segmentation. Second, it includes images taken from several different urban regions (at different altitudes). This is important since the variety covers the visual richness captured by a drone's 3D flight, performing horizontal and vertical maneuvers. MESSI contains images annotated with location, orientation, and the camera's intrinsic parameters and can be used to train a deep neural network for semantic segmentation or other applications of interest (e.g., localization, navigation, and tracking). This paper describes the dataset and provides annotation details. It also explains how semantic segmentation was performed using several neural network models and shows several relevant statistics. MESSI will be published in the public domain to serve as an evaluation benchmark for semantic segmentation using images captured by a drone or similar vehicle flying over a dense urban environment.
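As a rough illustration of how a multi-altitude benchmark like this might be used, the sketch below computes mean intersection-over-union (mIoU) separately for each flight altitude. The abstract does not say which metric the authors report, so mIoU is an assumption here, chosen because it is a common segmentation metric. The function names `mean_iou` and `miou_by_altitude`, the flat label lists, and the altitude values are all hypothetical and are not part of the MESSI tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in the ground truth or the prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1
    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both labelings
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else float("nan")


def miou_by_altitude(
    samples: Iterable[Tuple[float, Sequence[int], Sequence[int]]],
    num_classes: int,
) -> Dict[float, float]:
    """Pool the pixels of all images taken at the same altitude, then score each pool."""
    pooled: Dict[float, Tuple[List[int], List[int]]] = defaultdict(lambda: ([], []))
    for altitude_m, y_true, y_pred in samples:
        pooled[altitude_m][0].extend(y_true)
        pooled[altitude_m][1].extend(y_pred)
    return {alt: mean_iou(t, p, num_classes) for alt, (t, p) in sorted(pooled.items())}


# Toy example: flattened per-pixel labels for two images at made-up altitudes.
samples = [
    (30.0, [0, 0, 1, 1], [0, 1, 1, 1]),
    (50.0, [0, 1], [0, 1]),
]
scores = miou_by_altitude(samples, num_classes=2)
```

Pooling pixels per altitude before scoring is one of several reasonable choices. Averaging per-image scores instead would weight each image equally rather than each pixel.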