Yury Brodskiy

CV
h-index41
6papers
121citations
Novelty31%
AI Score40

6 Papers

38.5ROMay 14Code
SeaVis: Modeling and Control of a Remotely Operated Towed Vehicle for Seabed Visualization and Mapping

Abdelhakim Amer, Aske Alstrup, Frederik Rasmussen et al.

High-resolution seafloor mapping necessitates stable and precise positioning for underwater robots. This paper introduces a novel mathematical model for SeaVis remotely operated towed vehicles (ROTVs) and develops a gain-scheduled linear-quadratic regulator (LQR) for robust depth and attitude control. We validate the approach in a high-fidelity simulation, benchmarking the LQR against a conventional PID controller over a challenging seabed profile. The presented results demonstrate the LQR's superior performance, with significantly enhanced robustness to disturbances, greater control efficiency, and substantially reduced flap actuation. The gain scheduling also confirms the controller's effectiveness across the full operational velocity range. The complete simulation environment and controller are open-sourced.

CVDec 19, 2023Code
Loss it right: Euclidean and Riemannian Metrics in Learning-based Visual Odometry

Olaya Álvarez-Tuñón, Yury Brodskiy, Erdal Kayacan

This paper overviews different pose representations and metric functions in visual odometry (VO) networks. The performance of VO networks heavily relies on how their architecture encodes the information. The choice of pose representation and loss function significantly impacts network convergence and generalization. We investigate these factors in the VO network DeepVO by implementing loss functions based on Euler, quaternion, and chordal distance and analyzing their influence on performance. The results of this study provide insights into how loss functions affect the designing of efficient and accurate VO networks for camera motion estimation. The experiments illustrate that a distance that complies with the mathematical requirements of a metric, such as the chordal distance, provides better generalization and faster convergence. The code for the experiments can be found at https://github.com/remaro-network/Loss_VO_right

LGApr 28, 2025Code
Modelling of Underwater Vehicles using Physics-Informed Neural Networks with Control

Abdelhakim Amer, David Felsager, Yury Brodskiy et al.

Physics-informed neural networks (PINNs) integrate physical laws with data-driven models to improve generalization and sample efficiency. This work introduces an open-source implementation of the Physics-Informed Neural Network with Control (PINC) framework, designed to model the dynamics of an underwater vehicle. Using initial states, control actions, and time inputs, PINC extends PINNs to enable physically consistent transitions beyond the training domain. Various PINC configurations are tested, including differing loss functions, gradient-weighting schemes, and hyperparameters. Validation on a simulated underwater vehicle demonstrates more accurate long-horizon predictions compared to a non-physics-informed baseline

ROMar 4, 2025
Monocular visual simultaneous localization and mapping: (r)evolution from geometry to deep learning-based pipelines

Olaya Alvarez-Tunon, Yury Brodskiy, Erdal Kayacan

With the rise of deep learning, there is a fundamental change in visual SLAM algorithms toward developing different modules trained as end-to-end pipelines. However, regardless of the implementation domain, visual SLAM's performance is subject to diverse environmental challenges, such as dynamic elements in outdoor environments, harsh imaging conditions in underwater environments, or blurriness in high-speed setups. These environmental challenges need to be identified to study the real-world viability of SLAM implementations. Motivated by the aforementioned challenges, this paper surveys the current state of visual SLAM algorithms according to the two main frameworks: geometry-based and learning-based SLAM. First, we introduce a general formulation of the SLAM pipeline that includes most of the implementations in the literature. Second, those implementations are classified and surveyed for geometry and learning-based SLAM. After that, environment-specific challenges are formulated to enable experimental evaluation of the resilience of different visual SLAM classes to varying imaging conditions. We address two significant issues in surveying visual SLAM, providing (1) a consistent classification of visual SLAM pipelines and (2) a robust evaluation of their performance under different deployment conditions. Finally, we give our take on future opportunities for visual SLAM implementations.

CVMar 20, 2024
Uncertainty Driven Active Learning for Image Segmentation in Underwater Inspection

Luiza Ribeiro Marnet, Yury Brodskiy, Stella Grasshof et al.

Active learning aims to select the minimum amount of data to train a model that performs similarly to a model trained with the entire dataset. We study the potential of active learning for image segmentation in underwater infrastructure inspection tasks, where large amounts of data are typically collected. The pipeline inspection images are usually semantically repetitive but with great variations in quality. We use mutual information as the acquisition function, calculated using Monte Carlo dropout. To assess the effectiveness of the framework, DenseNet and HyperSeg are trained with the CamVid dataset using active learning. In addition, HyperSeg is trained with a pipeline inspection dataset of over 50,000 images. For the pipeline dataset, HyperSeg with active learning achieved 67.5% meanIoU using 12.5% of the data, and 61.4% with the same amount of randomly selected images. This shows that using active learning for segmentation models in underwater inspection tasks can lower the cost significantly.

CVJul 9, 2019
UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor

Peter Hviid Christiansen, Mikkel Fly Kragh, Yury Brodskiy et al.

It is hard to create consistent ground truth data for interest points in natural images, since interest points are hard to define clearly and consistently for a human annotator. This makes interest point detectors non-trivial to build. In this work, we introduce an unsupervised deep learning-based interest point detector and descriptor. Using a self-supervised approach, we utilize a siamese network and a novel loss function that enables interest point scores and positions to be learned automatically. The resulting interest point detector and descriptor is UnsuperPoint. We use regression of point positions to 1) make UnsuperPoint end-to-end trainable and 2) to incorporate non-maximum suppression in the model. Unlike most trainable detectors, it requires no generation of pseudo ground truth points, no structure-from-motion-generated representations and the model is learned from only one round of training. Furthermore, we introduce a novel loss function to regularize network predictions to be uniformly distributed. UnsuperPoint runs in real-time with 323 frames per second (fps) at a resolution of $224\times320$ and 90 fps at $480\times640$. It is comparable or better than state-of-the-art performance when measured for speed, repeatability, localization, matching score and homography estimation on the HPatch dataset.