MariData: One-Step Unpaired Image Translation for Maritime Environments
For researchers and practitioners in autonomous maritime navigation, this work provides an efficient data synthesis pipeline to overcome the data scarcity bottleneck, though the night domain results indicate limitations.
This paper addresses the scarcity of diverse training data for maritime autonomous surface ships by introducing a one-step unpaired image translation framework (CycleGAN-turbo with zero-convolution skip connections) that preserves fine structural details of small navigational objects. The method synthesizes realistic atmospheric conditions (foggy, sunset, night) from day images, with qualitative evaluations showing strong structural retention for foggy and sunset domains, while night translation faces semantic hallucination challenges.
The development of robust perception systems for Maritime Autonomous Surface Ships (MASS) is heavily constrained by the scarcity of diverse training data, particularly for adverse weather and low-light conditions. Because collecting paired images in dynamic maritime environments is physically impossible, synthetic data generation via unpaired image-to-image translation offers a critical solution. However, existing generative models fail to preserve the fine structural details of small navigational objects because of latent compression bottlenecks. In this paper, we introduce a framework for generating synthetic maritime data using CycleGAN-turbo, a one-step unpaired translation architecture. By incorporating zero-convolution skip connections to bypass the Variational Autoencoder (VAE) bottleneck, our approach explicitly preserves small object details (e.g., distant vessels and sea marks) during translation. We compiled a dataset of 7,000 maritime images to train and evaluate models for Day-to-Foggy, Day-to-Sunset, and Day-to-Night domain translations. Qualitative evaluations and variable-strength inference studies demonstrate that our method synthesizes realistic atmospheric conditions while maintaining the underlying semantic structure of the scene. The Day-to-Foggy and Day-to-Sunset models exhibit strong structural retention, whereas the Day-to-Night model highlights the challenge of semantic hallucination, such as the generation of artificial coastal lights, induced by unbalanced training distributions. Ultimately, this work establishes an efficient, structure-aware data synthesis pipeline that directly addresses the data scarcity bottleneck in autonomous maritime navigation.
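The zero-convolution skip connection mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class `ZeroConvSkip`, the function `decoder_input`, the channel count, and the feature-map sizes are all hypothetical and chosen only for illustration. The sketch assumes the common formulation in which a zero convolution is a 1x1 convolution whose weights and bias are initialized to zero, so the skip path contributes nothing at the start of training and gradually learns to pass encoder detail around the latent bottleneck.

```python
import numpy as np


class ZeroConvSkip:
    """1x1 convolution with zero-initialized weights and bias (hypothetical helper)."""

    def __init__(self, in_channels, out_channels):
        # Zero initialization: the skip path outputs zeros until training updates it.
        self.weight = np.zeros((out_channels, in_channels))
        self.bias = np.zeros(out_channels)

    def __call__(self, encoder_feat):
        # encoder_feat has shape (C_in, H, W); a 1x1 convolution mixes channels per pixel.
        out = np.einsum("oc,chw->ohw", self.weight, encoder_feat)
        return out + self.bias[:, None, None]


def decoder_input(decoder_feat, encoder_feat, skip):
    """Add the skip-path output to the decoder features at the same resolution."""
    return decoder_feat + skip(encoder_feat)


rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 8, 8))  # high-resolution encoder features (assumed shape)
dec = rng.normal(size=(4, 8, 8))  # decoder features at the matching resolution

skip = ZeroConvSkip(in_channels=4, out_channels=4)

# At initialization the skip path is silent, so the decoder sees its own features only.
fused_at_init = decoder_input(dec, enc, skip)

# Stand-in for learned weights: an identity mapping passes encoder detail straight through.
skip.weight = np.eye(4)
fused_after_update = decoder_input(dec, enc, skip)
```

The zero initialization means that adding the skip path does not disturb the pretrained decoder at the first training step, while non-zero weights later let fine spatial detail, such as a distant vessel only a few pixels wide, reach the decoder without passing through the compressed latent.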