h-index: 8
11 papers
35 citations
Novelty: 44%
AI Score: 50

11 Papers

CV · Aug 23, 2022
In-Air Imaging Sonar Sensor Network with Real-Time Processing Using GPUs

Wouter Jansen, Dennis Laurijssen, Robin Kerstens et al.

For autonomous navigation and robotic applications, sensing the environment correctly is crucial. Many sensing modalities exist for this purpose. One modality that has come into use in recent years is in-air imaging sonar. It is ideal in complex environments with rough conditions such as dust or fog. However, as with most sensing modalities, multiple such sensors are needed to capture the full 360-degree range around the mobile platform. Currently, the processing algorithms used to create this data are too slow to serve multiple sensors at a reasonably fast update rate. Furthermore, a flexible and robust framework is needed to easily integrate multiple imaging sonar sensors into any setup and serve multiple application types for the data. In this paper we present a sensor network framework designed for this novel sensing modality. Furthermore, an implementation of the processing algorithm on a Graphics Processing Unit is proposed to potentially decrease the computing time and allow for real-time processing of one or more imaging sonar sensors at a sufficiently high update rate.

0.9 · CV · Mar 30
Intelligent Road Condition Monitoring using 3D In-Air SONAR Sensing

Amber Cassimon, Robin Kerstens, Walter Daems et al.

In this paper, we investigate the capabilities of in-air 3D SONAR sensors for the monitoring of road surface conditions. Concretely, we consider two applications: road material classification and road damage detection and classification. While such tasks can be performed with other sensor modalities, such as camera sensors and LiDAR sensors, these sensor modalities tend to fail in harsh sensing conditions, such as heavy rain, smoke or fog. By using a sensing modality that is robust to such interference, we enable the creation of opportunistic sensing applications, where vehicles performing other tasks (garbage collection, mail delivery, etc.) can also be used to monitor the condition of the road. For these tasks, we use a single dataset, in which different types of damage are annotated, with labels including the material of the road surface. In the material classification task, we differentiate between three different road materials: Asphalt, Concrete and Element roads. In the damage detection and classification task, we determine if there is damage, and what type of damage (independent of material type), without localizing the damage. We are successful in determining the road surface type from SONAR sensor data, with F1 scores approaching 90% on the test set, but find that for the detection of damage, performance lags, with an F1 score around 75%. From this, we conclude that SONAR sensing is a promising modality to include in opportunistic sensing-based pavement management systems, but that further research is needed to reach the desired accuracy.
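
As a reading aid for the F1 figures quoted above, the following is a minimal sketch of how a per-class and macro-averaged F1 score can be computed for a three-class material classifier. The abstract does not state which averaging the authors used, so macro averaging is an assumption here, and the function names and the toy labels are purely illustrative, not taken from the paper or its dataset.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class, treating `positive` as the class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    scores = [f1_score(true_labels, predicted_labels, c) for c in classes]
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only.
true = ["asphalt", "asphalt", "concrete", "concrete", "element", "element"]
pred = ["asphalt", "concrete", "concrete", "concrete", "element", "asphalt"]

concrete_f1 = f1_score(true, pred, "concrete")
overall_f1 = macro_f1(true, pred)
```

Because F1 is the harmonic mean of precision and recall, a class with perfect recall but some false positives (here "concrete") still scores below 1.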

CV · Feb 12
LLM-Driven 3D Scene Generation of Agricultural Simulation Environments

Arafa Yoncalik, Wouter Jansen, Nico Huebel et al.

Procedural generation techniques in 3D rendering engines have revolutionized the creation of complex environments, reducing reliance on manual design. Recent approaches using Large Language Models (LLMs) for 3D scene generation show promise but often lack domain-specific reasoning, verification mechanisms, and modular design. These limitations lead to reduced control and poor scalability. This paper investigates the use of LLMs to generate agricultural synthetic simulation environments from natural language prompts, specifically to address the limitations of lacking domain-specific reasoning, verification mechanisms, and modular design. A modular multi-LLM pipeline was developed, integrating 3D asset retrieval, domain knowledge injection, and code generation for the Unreal rendering engine using its API. This results in a 3D environment with realistic planting layouts and environmental context, all based on the input prompt and the domain knowledge. To enhance accuracy and scalability, the system employs a hybrid strategy combining LLM optimization techniques such as few-shot prompting, Retrieval-Augmented Generation (RAG), finetuning, and validation. Unlike monolithic models, the modular architecture enables structured data handling, intermediate verification, and flexible expansion. The system was evaluated using structured prompts and semantic accuracy metrics. A user study assessed realism and familiarity against real-world images, while an expert comparison demonstrated significant time savings over manual scene design. The results confirm the effectiveness of multi-LLM pipelines in automating domain-specific 3D scene generation with improved reliability and precision. Future work will explore expanding the asset hierarchy, incorporating real-time generation, and adapting the pipeline to other simulation domains beyond agriculture.

RO · Jun 27, 2025 · Code
ASVSim (AirSim for Surface Vehicles): A High-Fidelity Simulation Framework for Autonomous Surface Vehicle Research

Bavo Lesy, Siemen Herremans, Robin Kerstens et al.

The transport industry has recently shown significant interest in unmanned surface vehicles (USVs), specifically for port and inland waterway transport. These systems can improve operational efficiency and safety, which is especially relevant in the European Union, where initiatives such as the Green Deal are driving a shift towards increased use of inland waterways. At the same time, a shortage of qualified personnel is accelerating the adoption of autonomous solutions. However, there is a notable lack of open-source, high-fidelity simulation frameworks and datasets for developing and evaluating such solutions. To address these challenges, we introduce AirSim For Surface Vehicles (ASVSim), an open-source simulation framework specifically designed for autonomous shipping research in inland and port environments. The framework combines simulated vessel dynamics with marine sensor simulation capabilities, including radar and camera systems, and supports the generation of synthetic datasets for training computer vision models and reinforcement learning agents. Built upon Cosys-AirSim, ASVSim provides a comprehensive platform for developing autonomous navigation algorithms and generating synthetic datasets. The simulator supports research into both traditional control methods and deep learning-based approaches. Through limited experiments, we demonstrate the potential of the simulator in these research areas. ASVSim is provided as an open-source project under the MIT license, making autonomous navigation research accessible to a larger part of the ocean engineering community.

CV · Dec 18, 2025
Predictive Modeling of Maritime Radar Data Using Transformer Architecture

Bjorna Qesaraku, Jan Steckel

Maritime autonomous systems require robust predictive capabilities to anticipate vessel motion and environmental dynamics. While transformer architectures have revolutionized AIS-based trajectory prediction and demonstrated feasibility for sonar frame forecasting, their application to maritime radar frame prediction remains unexplored, creating a critical gap given radar's all-weather reliability for navigation. This survey systematically reviews predictive modeling approaches relevant to maritime radar, with emphasis on transformer architectures for spatiotemporal sequence forecasting, where existing representative methods are analyzed according to data type, architecture, and prediction horizon. Our review shows that, while the literature has demonstrated transformer-based frame prediction for sonar sensing, no prior work addresses transformer-based maritime radar frame prediction, thereby defining a clear research gap and motivating a concrete research direction for future work in this area.

SP · Sep 8, 2025
Towards In-Air Ultrasonic QR Codes: Deep Learning for Classification of Passive Reflector Constellations

Wouter Jansen, Jan Steckel

In environments where visual sensors falter, in-air sonar provides a reliable alternative for autonomous systems. While previous research has successfully classified individual acoustic landmarks, this paper takes a step towards increasing information capacity by introducing reflector constellations as encoded tags. Our primary contribution is a multi-label Convolutional Neural Network (CNN) designed to simultaneously identify multiple, closely spaced reflectors from a single in-air 3D sonar measurement. Our initial findings on a small dataset confirm the feasibility of this approach, validating the ability to decode these complex acoustic patterns. Secondly, we investigated using adaptive beamforming with null-steering to isolate individual reflectors for single-label classification. Finally, we discuss the experimental results and limitations, offering key insights and future directions for developing acoustic landmark systems with significantly increased information entropy and their accurate and robust detection and classification.

CV · Sep 6, 2025
LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications

Niels Balemans, Ali Anwar, Jan Steckel et al.

This paper extends LiDAR-BIND, a modular multi-modal fusion framework that binds heterogeneous sensors (radar, sonar) to a LiDAR-defined latent space, with mechanisms that explicitly enforce temporal consistency. We introduce three contributions: (i) temporal embedding similarity that aligns consecutive latent representations, (ii) a motion-aligned transformation loss that matches displacement between predictions and ground truth LiDAR, and (iii) windowed temporal fusion using a specialised temporal module. We further update the model architecture to better preserve spatial structure. Evaluations on radar/sonar-to-LiDAR translation demonstrate improved temporal and spatial coherence, yielding lower absolute trajectory error and better occupancy map accuracy in Cartographer-based SLAM (Simultaneous Localisation and Mapping). We propose different metrics based on the Fréchet Video Motion Distance (FVMD) and a correlation-peak distance metric providing practical temporal quality indicators to evaluate SLAM performance. The proposed temporal LiDAR-BIND, or LiDAR-BIND-T, maintains modular modality fusion while substantially enhancing temporal stability, resulting in improved robustness and performance for downstream SLAM.

LG · Sep 4, 2025
Resource-Aware Neural Network Pruning Using Graph-based Reinforcement Learning

Dieter Balemans, Thomas Huybrechts, Jan Steckel et al.

This paper presents a novel approach to neural network pruning by integrating a graph-based observation space into an AutoML framework to address the limitations of existing methods. Traditional pruning approaches often depend on hand-crafted heuristics and local optimization perspectives, which can lead to suboptimal performance and inefficient pruning strategies. Our framework transforms the pruning process by introducing a graph representation of the target neural network that captures complete topological relationships between layers and channels, replacing the limited layer-wise observation space with a global view of network structure. The core innovations include a Graph Attention Network (GAT) encoder that processes the network's graph representation and generates a rich embedding. Additionally, for the action space we transition from continuous pruning ratios to fine-grained binary action spaces which enables the agent to learn optimal channel importance criteria directly from data, moving away from predefined scoring functions. These contributions are modelled within a Constrained Markov Decision Process (CMDP) framework, allowing the agent to make informed pruning decisions while adhering to resource constraints such as target compression rates. For this, we design a self-competition reward system that encourages the agent to outperform its previous best performance while satisfying the defined constraints. We demonstrate the effectiveness of our approach through extensive experiments on benchmark datasets including CIFAR-10, CIFAR-100, and ImageNet. The experiments show that our method consistently outperforms traditional pruning techniques, showing state-of-the-art results while learning task-specific pruning strategies that identify functionally redundant connections beyond simple weight magnitude considerations.

AS · Jun 13, 2024
Tool Wear Prediction in CNC Turning Operations using Ultrasonic Microphone Arrays and CNNs

Jan Steckel, Arne Aerts, Erik Verreycken et al.

This paper introduces a novel method for predicting tool wear in CNC turning operations, combining ultrasonic microphone arrays and convolutional neural networks (CNNs). High-frequency acoustic emissions between 0 kHz and 60 kHz are enhanced using beamforming techniques to improve the signal-to-noise ratio. The processed acoustic data is then analyzed by a CNN, which predicts the Remaining Useful Life (RUL) of cutting tools. Trained on data from 350 workpieces machined with a single carbide insert, the model can accurately predict the RUL of the carbide insert. Our results demonstrate the potential gained by integrating advanced ultrasonic sensors with deep learning for accurate predictive maintenance tasks in CNC machining.
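
The abstract above does not say which beamformer is used. As a hedged illustration of how array beamforming can raise the signal-to-noise ratio, here is a minimal delay-and-sum sketch with integer sample delays. Delay-and-sum is a standard textbook technique swapped in for illustration, not the authors' confirmed method; the function name, the toy pulse, and the delay values are all hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples for the steering direction.
    Samples shifted in from outside the recording are treated as zero.
    """
    n = len(channels[0])
    output = []
    for i in range(n):
        acc = 0.0
        for samples, delay in zip(channels, delays):
            j = i + delay  # read ahead by the delay to undo the arrival lag
            if 0 <= j < n:
                acc += samples[j]
        output.append(acc / len(channels))
    return output


# Toy example: one pulse reaching three microphones 0, 1 and 2 samples late.
mics = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
]

steered = delay_and_sum(mics, [0, 1, 2])    # delays match the source
unsteered = delay_and_sum(mics, [0, 0, 0])  # delays ignore the source
```

With matching delays the pulse adds coherently and keeps its full amplitude, while the unsteered average smears it across three samples at one third of the amplitude. Uncorrelated noise is attenuated by the averaging in both cases, which is where the signal-to-noise gain comes from.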

RO · May 21, 2024
EchoPT: A Pretrained Transformer Architecture that Predicts 2D In-Air Sonar Images for Mobile Robotics

Jan Steckel, Wouter Jansen, Nico Huebel

The predictive brain hypothesis suggests that perception can be interpreted as the process of minimizing the error between predicted perception tokens generated by an internal world model and actual sensory input tokens. When implementing working examples of this hypothesis in the context of in-air sonar, significant difficulties arise due to the sparse nature of the reflection model that governs ultrasonic sensing. Despite these challenges, creating consistent world models using sonar data is crucial for implementing predictive processing of ultrasound data in robotics. In an effort to enable robust robot behavior using ultrasound as the sole exteroceptive sensor modality, this paper introduces EchoPT, a pretrained transformer architecture designed to predict 2D sonar images from previous sensory data and robot ego-motion information. We detail the transformer architecture that drives EchoPT and compare the performance of our model to several state-of-the-art techniques. In addition to presenting and evaluating our EchoPT model, we demonstrate the effectiveness of this predictive perception approach in two robotic tasks.

RO · May 7, 2021
LatentSLAM: unsupervised multi-sensor representation learning for localization and mapping

Ozan Çatal, Wouter Jansen, Tim Verbelen et al.

Biologically inspired algorithms for simultaneous localization and mapping (SLAM) such as RatSLAM have been shown to yield effective and robust robot navigation in both indoor and outdoor environments. One drawback, however, is the sensitivity to perceptual aliasing due to the template matching of low-dimensional sensory templates. In this paper, we propose an unsupervised representation learning method that yields low-dimensional latent state descriptors that can be used for RatSLAM. Our method is sensor agnostic and can be applied to any sensor modality, as we illustrate for camera images, radar range-Doppler maps and lidar scans. We also show how combining multiple sensors can increase robustness by reducing the number of false matches. We evaluate on a dataset captured with a mobile robot navigating in a warehouse-like environment, moving through different aisles with similar appearance, making it hard for SLAM algorithms to disambiguate locations.