ROJun 14, 2023
Investigation of the Challenges of Underwater-Visual-Monocular-SLAMMichele Grimaldi, David Nakath, Mengkun She et al.
In this paper, we present a comprehensive investigation of the challenges of Monocular Visual Simultaneous Localization and Mapping (vSLAM) methods for underwater robots. While significant progress has been made in state estimation methods that utilize visual data in the past decade, most evaluations have been limited to controlled indoor and urban environments, where impressive performance was demonstrated. However, these techniques have not been extensively tested in extremely challenging conditions, such as underwater scenarios where factors such as water and light conditions, robot path, and depth can greatly impact algorithm performance. Hence, our evaluation is conducted in real-world AUV scenarios as well as laboratory settings which provide precise external reference. A focus is laid on understanding the impact of environmental conditions, such as optical properties of the water and illumination scenarios, on the performance of monocular vSLAM methods. To this end, we first show that all methods perform very well in in-air settings and subsequently show the degradation of their performance in challenging underwater environments. The final goal of this study is to identify techniques that can improve accuracy and robustness of SLAM methods in such conditions. To achieve this goal, we investigate the potential of image enhancement techniques to improve the quality of input images used by the SLAM methods, specifically in low visibility and extreme lighting scenarios in scattering media. We present a first evaluation on calibration maneuvers and simple image restoration techniques to determine their ability to enable or enhance the performance of monocular SLAM methods in underwater environments.
CVAug 11, 2023
Semihierarchical Reconstruction and Weak-area Revisiting for Robotic Visual Seafloor MappingMengkun She, Yifan Song, David Nakath et al.
Despite impressive results achieved by many on-land visual mapping algorithms in the recent decades, transferring these methods from land to the deep sea remains a challenge due to harsh environmental conditions. Images captured by autonomous underwater vehicles (AUVs), equipped with high-resolution cameras and artificial illumination systems, often suffer from heterogeneous illumination and quality degradation caused by attenuation and scattering, on top of refraction of light rays. These challenges often result in the failure of on-land SLAM approaches when applied underwater or cause SfM approaches to exhibit drifting or omit challenging images. Consequently, this leads to gaps, jumps, or weakly reconstructed areas. In this work, we present a navigation-aided hierarchical reconstruction approach to facilitate the automated robotic 3D reconstruction of hectares of seafloor. Our hierarchical approach combines the advantages of SLAM and global SfM that is much more efficient than incremental SfM, while ensuring the completeness and consistency of the global map. This is achieved through identifying and revisiting problematic or weakly reconstructed areas, avoiding to omit images and making better use of limited dive time. The proposed system has been extensively tested and evaluated during several research cruises, demonstrating its robustness and practicality in real-world conditions.
30.2CVApr 21
BALTIC: A Benchmark and Cross-Domain Strategy for 3D Reconstruction Across Air and Underwater Domains Under Varying IlluminationMichele Grimaldi, David Nakath, Oscar Pizarro et al.
Robust 3D reconstruction across varying environmental conditions remains a critical challenge for robotic perception, particularly when transitioning between air and water. To address this, we introduce BALTIC, a controlled benchmark designed to systematically evaluate modern 3D reconstruction methods under variations in medium and lighting. The benchmark comprises 13 datasets spanning two media (air and water) and three lighting conditions (ambient, artificial, and mixed), with additional variations in motion type, scanning pattern, and initialization trajectory, resulting in a diverse set of sequences. Our experimental setup features a custom water tank equipped with a monocular camera and an HTC Vive tracker, enabling accurate ground-truth pose estimation. We further investigate cross-domain reconstruction by augmenting underwater image sequences with a small number of in-air views captured under similar lighting conditions. We evaluate Structure-from-Motion reconstruction using COLMAP in terms of both trajectory accuracy and scene geometry, and use these reconstructions as input to Neural Radiance Fields and 3D Gaussian Splatting methods. The resulting models are assessed against ground-truth trajectories and in-air references, while rendered outputs are compared using perceptual and photometric metrics. Additionally, we perform a color restoration analysis to evaluate radiometric consistency across domains. Our results show that under controlled, texture-consistent conditions, Gaussian Splatting with simple preprocessing (e.g., white balance correction) can achieve performance comparable to specialized underwater methods, although its robustness decreases in more complex and heterogeneous real-world environments
CVMar 13, 2024Code
Refractive COLMAP: Refractive Structure-from-Motion RevisitedMengkun She, Felix Seegräber, David Nakath et al.
In this paper, we present a complete refractive Structure-from-Motion (RSfM) framework for underwater 3D reconstruction using refractive camera setups (for both, flat- and dome-port underwater housings). Despite notable achievements in refractive multi-view geometry over the past decade, a robust, complete and publicly available solution for such tasks is not available at present, and often practical applications have to resort to approximating refraction effects by the intrinsic (distortion) parameters of a pinhole camera model. To fill this gap, we have integrated refraction considerations throughout the entire SfM process within the state-of-the-art, open-source SfM framework COLMAP. Numerical simulations and reconstruction results on synthetically generated but photo-realistic images with ground truth validate that enabling refraction does not compromise accuracy or robustness as compared to in-air reconstructions. Finally, we demonstrate the capability of our approach for large-scale refractive scenarios using a dataset consisting of nearly 6000 images. The implementation is released as open-source at: https://cau-git.rz.uni-kiel.de/inf-ag-koeser/colmap_underwater.
CVSep 24, 2025
Optical Ocean Recipes: Creating Realistic Datasets to Facilitate Underwater Vision ResearchPatricia Schöntag, David Nakath, Judith Fischer et al.
The development and evaluation of machine vision in underwater environments remains challenging, often relying on trial-and-error-based testing tailored to specific applications. This is partly due to the lack of controlled, ground-truthed testing environments that account for the optical challenges, such as color distortion from spectrally variant light attenuation, reduced contrast and blur from backscatter and volume scattering, and dynamic light patterns from natural or artificial illumination. Additionally, the appearance of ocean water in images varies significantly across regions, depths, and seasons. However, most machine vision evaluations are conducted under specific optical water types and imaging conditions, therefore often lack generalizability. Exhaustive testing across diverse open-water scenarios is technically impractical. To address this, we introduce the \textit{Optical Ocean Recipes}, a framework for creating realistic datasets under controlled underwater conditions. Unlike synthetic or open-water data, these recipes, using calibrated color and scattering additives, enable repeatable and controlled testing of the impact of water composition on image appearance. Hence, this provides a unique framework for analyzing machine vision in realistic, yet controlled underwater scenarios. The controlled environment enables the creation of ground-truth data for a range of vision tasks, including water parameter estimation, image restoration, segmentation, visual SLAM, and underwater image synthesis. We provide a demonstration dataset generated using the Optical Ocean Recipes and briefly demonstrate the use of our system for two underwater vision tasks. The dataset and evaluation code will be made available.
CVApr 14, 2025
Relative Illumination Fields: Learning Medium and Light Independent Underwater ScenesMengkun She, Felix Seegräber, David Nakath et al.
We address the challenge of constructing a consistent and photorealistic Neural Radiance Field in inhomogeneously illuminated, scattering environments with unknown, co-moving light sources. While most existing works on underwater scene representation focus on a static homogeneous illumination, limited attention has been paid to scenarios such as when a robot explores water deeper than a few tens of meters, where sunlight becomes insufficient. To address this, we propose a novel illumination field locally attached to the camera, enabling the capture of uneven lighting effects within the viewing frustum. We combine this with a volumetric medium representation to an overall method that effectively handles interaction between dynamic illumination field and static scattering medium. Evaluation results demonstrate the effectiveness and flexibility of our approach.
CVDec 21, 2023
Visual Tomography: Physically Faithful Volumetric Models of Partially Translucent ObjectsDavid Nakath, Xiangyu Weng, Mengkun She et al.
When created faithfully from real-world data, Digital 3D representations of objects can be useful for human or computer-assisted analysis. Such models can also serve for generating training data for machine learning approaches in settings where data is difficult to obtain or where too few training data exists, e.g. by providing novel views or images in varying conditions. While the vast amount of visual 3D reconstruction approaches focus on non-physical models, textured object surfaces or shapes, in this contribution we propose a volumetric reconstruction approach that obtains a physical model including the interior of partially translucent objects such as plankton or insects. Our technique photographs the object under different poses in front of a bright white light source and computes absorption and scattering per voxel. It can be interpreted as visual tomography that we solve by inverse raytracing. We additionally suggest a method to convert non-physical NeRF media into a physically-based volumetric grid for initialization and illustrate the usefulness of the approach using two real-world plankton validation sets, the lab-scanned models being finally also relighted and virtually submerged in a scenario with augmented medium and illumination conditions. Please visit the project homepage at www.marine.informatik.uni-kiel.de/go/vito
ROMay 7, 2023
Design, Implementation and Evaluation of an External Pose-Tracking System for Underwater CamerasBirger Winkel, David Nakath, Felix Woelk et al.
In order to advance underwater computer vision and robotics from lab environments and clear water scenarios to the deep dark ocean or murky coastal waters, representative benchmarks and realistic datasets with ground truth information are required. In particular, determining the camera pose is essential for many underwater robotic or photogrammetric applications and known ground truth is mandatory to evaluate the performance of e.g., simultaneous localization and mapping approaches in such extreme environments. This paper presents the conception, calibration and implementation of an external reference system for determining the underwater camera pose in real-time. The approach, based on an HTC Vive tracking system in air, calculates the underwater camera pose by fusing the poses of two controllers tracked above the water surface of a tank. It is shown that the mean deviation of this approach to an optical marker based reference in air is less than 3 mm and 0.3 deg. Finally, the usability of the system for underwater applications is demonstrated.
CVAug 14, 2021
Refractive Geometry for Underwater DomesMengkun She, David Nakath, Yifan Song et al.
Underwater cameras are typically placed behind glass windows to protect them from the water. Spherical glass, a dome port, is well suited for high water pressures at great depth, allows for a large field of view, and avoids refraction if a pinhole camera is positioned exactly at the sphere's center. Adjusting a real lens perfectly to the dome center is a challenging task, both in terms of how to actually guide the centering process (e.g. visual servoing) and how to measure the alignment quality, but also, how to mechanically perform the alignment. Consequently, such systems are prone to being decentered by some offset, leading to challenging refraction patterns at the sphere that invalidate the pinhole camera model. We show that the overall camera system becomes an axial camera, even for thick domes as used for deep sea exploration and provide a non-iterative way to compute the center of refraction without requiring knowledge of exact air, glass or water properties. We also analyze the refractive geometry at the sphere, looking at effects such as forward- vs. backward decentering, iso-refraction curves and obtain a 6th-degree polynomial equation for forward projection of 3D points in thin domes. We then propose a pure underwater calibration procedure to estimate the decentering from multiple images. This estimate can either be used during adjustment to guide the mechanical position of the lens, or can be considered in photogrammetric underwater applications.
CVJun 27, 2020
Deep Sea Robotic Imaging SimulatorYifan Song, David Nakath, Mengkun She et al.
Nowadays underwater vision systems are being widely applied in ocean research. However, the largest portion of the ocean - the deep sea - still remains mostly unexplored. Only relatively few image sets have been taken from the deep sea due to the physical limitations caused by technical challenges and enormous costs. Deep sea images are very different from the images taken in shallow waters and this area did not get much attention from the community. The shortage of deep sea images and the corresponding ground truth data for evaluation and training is becoming a bottleneck for the development of underwater computer vision methods. Thus, this paper presents a physical model-based image simulation solution, which uses an in-air texture and depth information as inputs, to generate underwater image sequences taken by robots in deep ocean scenarios. Different from shallow water conditions, artificial illumination plays a vital role in deep sea image formation as it strongly affects the scene appearance. Our radiometric image formation model considers both attenuation and scattering effects with co-moving spotlights in the dark. By detailed analysis and evaluation of the underwater image formation model, we propose a 3D lookup table structure in combination with a novel rendering strategy to improve simulation performance. This enables us to integrate an interactive deep sea robotic vision simulation in the Unmanned Underwater Vehicles simulator. To inspire further deep sea vision research by the community, we will release the source code of our deep sea image converter to the public.
CVJun 27, 2020
Light Pose Calibration for Camera-light Vision SystemsYifan Song, Furkan Elibol, Mengkun She et al.
Illuminating a scene with artificial light is a prerequisite for seeing in dark environments. However, nonuniform and dynamic illumination can deteriorate or even break computer vision approaches, for instance when operating a robot with headlights in the darkness. This paper presents a novel light calibration approach by taking multi-view and -distance images of a reference plane in order to provide pose information of the employed light sources to the computer vision system. By following a physical light propagation approach, under consideration of energy preservation, the estimation of light poses is solved by minimizing of the differences between real and rendered pixel intensities. During the evaluation we show the robustness and consistency of this method by statistically analyzing the light pose estimation results with different setups. Although the results are demonstrated using a rotationally-symmetric non-isotropic light, the method is suited also for non-symmetric lights.