SimuShips -- A High Resolution Simulation Dataset for Ship Detection with Precise Annotations
This work addresses the data scarcity problem in maritime obstacle detection, which is crucial for autonomous vessels. The contribution is incremental, however, because it builds on existing simulation and CNN methods.
The authors address the shortage of domain-specific datasets for obstacle detection on autonomous maritime surface vessels by introducing SimuShips, a high-resolution simulation dataset with precise annotations. In experiments with YOLOv5, combining real and simulated images improved recall for all classes by 2.9%.
Obstacle detection is a fundamental capability of an autonomous maritime surface vessel (AMSV). State-of-the-art obstacle detection algorithms are based on convolutional neural networks (CNNs). While CNNs provide high detection accuracy and fast detection speed, they require enormous amounts of training data. In particular, the availability of domain-specific datasets is a challenge for obstacle detection, and the difficulty of conducting on-site experiments limits the collection of maritime datasets. Given the logistical cost of on-site operations, simulation tools provide a safe and cost-efficient alternative for data collection. In this work, we introduce SimuShips, a publicly available simulation-based dataset for maritime environments. Our dataset consists of 9471 high-resolution (1920×1080) images covering a wide range of obstacle types and atmospheric and illumination conditions, along with variations in occlusion, scale, and visible proportion. We provide annotations in the form of bounding boxes. In addition, we conduct experiments with YOLOv5 to test the viability of simulation data. Our experiments indicate that combining real and simulated images improves the recall for all classes by 2.9%.
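Since the headline result is stated in terms of recall over bounding-box annotations, a minimal sketch of how recall can be computed for one image may help. The text above does not specify the matching procedure or the IoU threshold used in the experiments, so the greedy matching, the 0.5 threshold, the corner-format boxes, and the names `iou` and `recall` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth: List[Box], predictions: List[Box],
           iou_threshold: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by at least one prediction.

    Predictions are matched greedily in the order given; each ground-truth
    box can be matched at most once. The 0.5 threshold is an assumption.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    for pred in predictions:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
    return len(matched) / len(ground_truth)


# Hypothetical boxes inside a 1920x1080 frame: two ships, one detected.
gt_boxes = [(0.0, 0.0, 100.0, 100.0), (200.0, 200.0, 300.0, 300.0)]
pred_boxes = [(5.0, 5.0, 100.0, 100.0)]
print(recall(gt_boxes, pred_boxes))
```

In a full evaluation, predictions would normally be sorted by confidence before matching and recall would be aggregated per class over the whole test set; this sketch omits both steps for brevity.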