CV · LG · Oct 6, 2025

BenthiCat: An opti-acoustic dataset for advancing benthic classification and habitat mapping

arXiv:2510.04876v2 · 3 citations · h-index: 3 · Has Code
AI Analysis

The dataset provides a standardized benchmark for researchers in marine ecology and autonomous underwater vehicles, addressing a domain-specific data limitation. The contribution is incremental, however, because it focuses on dataset creation rather than novel algorithmic breakthroughs.

The paper tackles the scarcity of annotated datasets for benthic habitat mapping by introducing a multi-modal dataset with about a million side-scan sonar tiles, including 36,000 manually annotated ones, along with bathymetric maps and co-registered optical images, to support supervised and self-supervised learning for underwater classification.

Benthic habitat mapping is fundamental for understanding marine ecosystems, guiding conservation efforts, and supporting sustainable resource management. Yet, the scarcity of large, annotated datasets limits the development and benchmarking of machine learning models in this domain. This paper introduces a thorough multi-modal dataset, comprising about a million side-scan sonar (SSS) tiles collected along the coast of Catalonia (Spain), complemented by bathymetric maps and a set of co-registered optical images from targeted surveys using an autonomous underwater vehicle (AUV). Approximately 36,000 of the SSS tiles have been manually annotated with segmentation masks to enable supervised fine-tuning of classification models. All the raw sensor data, together with mosaics, are also released to support further exploration and algorithm development. To address challenges in multi-sensor data fusion for AUVs, we spatially associate optical images with corresponding SSS tiles, facilitating self-supervised, cross-modal representation learning. Accompanying open-source preprocessing and annotation tools are provided to enhance accessibility and encourage research. This resource aims to establish a standardized benchmark for underwater habitat mapping, promoting advancements in autonomous seafloor classification and multi-sensor integration.
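The abstract does not say how optical images are spatially associated with SSS tiles. As a rough illustration only, the sketch below pairs each georeferenced optical image with every tile whose footprint contains the image position. The `Tile` and `OpticalImage` classes, their field names, the axis-aligned footprints in projected metre coordinates, and the `associate` function are all hypothetical and are not taken from the BenthiCat tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical SSS tile with an axis-aligned footprint (projected metres)."""
    tile_id: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        # Half-open bounds so a point on a shared edge matches only one tile.
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


@dataclass(frozen=True)
class OpticalImage:
    """Hypothetical optical image with the position of its centre (projected metres)."""
    image_id: str
    x: float
    y: float


def associate(images, tiles):
    """Map each image id to the ids of all tiles whose footprint contains it."""
    return {
        img.image_id: [t.tile_id for t in tiles if t.contains(img.x, img.y)]
        for img in images
    }


# Toy data: two adjacent 10 m x 10 m tiles and three image positions.
tiles = [Tile("sss_0", 0, 0, 10, 10), Tile("sss_1", 10, 0, 20, 10)]
images = [
    OpticalImage("opt_a", 3.0, 4.0),
    OpticalImage("opt_b", 12.5, 9.0),
    OpticalImage("opt_c", 25.0, 1.0),  # outside every tile
]
pairs = associate(images, tiles)
print(pairs)
```

The resulting image-to-tile pairs are the kind of cross-modal correspondence that a self-supervised objective could consume. With around a million tiles, a spatial index would be a more practical choice than this linear scan.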

