MULTIAQUA: A multimodal maritime dataset and robust training strategies for multimodal semantic segmentation
This work addresses scene interpretation for unmanned surface vehicles under varied visual conditions. The contribution is incremental: it builds on existing multimodal methods and adds new data and training approaches.
The authors tackle poor visibility in maritime environments by introducing a multimodal dataset (MULTIAQUA) and robust training strategies that allow deep neural networks trained only on daytime images to retain reliable performance in near-complete darkness.
Unmanned surface vehicles can encounter a wide range of visual conditions during operation, some of which are very difficult to interpret. While most cases can be solved using only color camera images, some weather and lighting conditions require additional information. To expand the available maritime data, we present a novel multimodal maritime dataset, MULTIAQUA (Multimodal Aquatic Dataset). Our dataset contains synchronized, calibrated, and annotated data captured by sensors of different modalities, such as RGB, thermal, IR, LIDAR, etc. The dataset is aimed at developing supervised methods that can extract useful information from these modalities in order to provide high-quality scene interpretation regardless of potentially poor visibility conditions. To illustrate the benefits of the proposed dataset, we evaluate several multimodal methods on our difficult nighttime test set. We present training approaches that make multimodal methods more robust, enabling them to retain reliable performance even in near-complete darkness. Our approach allows a robust deep neural network to be trained using only daytime images, thus significantly simplifying data acquisition, annotation, and the training process.
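The abstract does not describe the robust training approaches in detail. As a purely illustrative sketch, one generic way to push a multimodal network to rely on non-RGB modalities while training on daytime data alone is to randomly darken or black out the RGB input during training. The function name `simulate_low_light`, the parameters `drop_prob` and `min_gain`, and the toy sample layout below are all assumptions made for this example and are not taken from the paper.

```python
import random


def simulate_low_light(sample, rng, drop_prob=0.2, min_gain=0.05):
    """Return a copy of `sample` with its RGB pixels darkened.

    Hypothetical augmentation, not the authors' method: with probability
    `drop_prob` the RGB image is blacked out entirely; otherwise it is
    scaled by a random gain in [min_gain, 1.0]. The thermal readings and
    the labels are left unchanged.
    """
    gain = 0.0 if rng.random() < drop_prob else rng.uniform(min_gain, 1.0)
    dark_rgb = [tuple(int(c * gain) for c in pixel) for pixel in sample["rgb"]]
    return {
        "rgb": dark_rgb,
        "thermal": list(sample["thermal"]),
        "labels": list(sample["labels"]),
        "gain": gain,
    }


# Toy daytime sample (assumed layout): three pixels with RGB values,
# thermal readings, and per-pixel class labels.
daytime = {
    "rgb": [(200, 180, 160), (30, 90, 150), (255, 255, 255)],
    "thermal": [0.42, 0.17, 0.88],
    "labels": [1, 0, 2],
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
augmented = [simulate_low_light(daytime, rng) for _ in range(5)]
```

In this sketch only the RGB modality is degraded, so the annotations made on daytime images remain valid for the darkened copies. Whether this resembles the strategies actually proposed for MULTIAQUA is not stated in the abstract.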