Arnoud Visser

RO
h-index: 7
5 papers
2 citations
Novelty: 16%
AI Score: 21

5 Papers

RO · Jul 8, 2024 · Code
An Earth Rover dataset recorded at the ICRA@40 party

Qi Zhang, Zhihao Lin, Arnoud Visser

The ICRA conference is celebrating its $40^{th}$ anniversary in Rotterdam in September 2024, with the Happy Birthday ICRA Party at the iconic Holland America Line Cruise Terminal as a highlight. One month later the IROS conference will take place, which will include the Earth Rover Challenge. In this challenge, open-world autonomous navigation models are studied in truly open-world settings. As part of the Earth Rover Challenge, several real-world navigation datasets have been recorded in cities worldwide, such as Auckland, New Zealand and Wuhan, China. The only dataset recorded in the Netherlands is from the small village of Oudewater. The proposal is to record a dataset with the robot used in the Earth Rover Challenge in Rotterdam, in front of the Holland America Line Cruise Terminal, before the festivities of the Happy Birthday ICRA Party start. See: https://github.com/SlamMate/vSLAM-on-FrodoBots-2K

RO · Sep 5, 2024
Bringing the RT-1-X Foundation Model to a SCARA robot

Jonathan Salzer, Arnoud Visser

Traditional robotic systems require specific training data for each task, environment, and robot form. While recent advancements in machine learning have enabled models to generalize across new tasks and environments, the challenge of adapting these models to entirely new settings remains largely unexplored. This study addresses this by investigating the generalization capabilities of the RT-1-X robotic foundation model to a type of robot unseen during its training: a SCARA robot from UMI-RTX. Initial experiments reveal that RT-1-X does not generalize zero-shot to the unseen type of robot. However, fine-tuning of the RT-1-X model by demonstration allows the robot to learn a pickup task which was part of the foundation model (but learned for another type of robot). When the robot is presented with an object that is included in the foundation model but not in the fine-tuning dataset, it demonstrates that only the skill, but not the object-specific knowledge, has been transferred.

RO · Jul 3, 2024
Position and Altitude of the Nao Camera Head from Two Points on the Soccer Field plus the Gravitational Direction

Stijn Oomes, Arnoud Visser

To be able to play soccer, a robot needs a good estimate of its current position on the field. Ideally, multiple features are visible that have known locations. By applying trigonometry we can estimate the viewpoint from which the observation was made. Given that the Nao robots of the Standard Platform League have quite a limited field of view, a given camera frame typically only allows for one or two points to be recognized. In this paper we propose a method for determining the (x, y) coordinates on the field and the height h of the camera from the geometry of a simplified tetrahedron. This configuration is formed by two observed points on the ground plane plus the gravitational direction. When the distance between the two points is known, and the directions to the points plus the gravitational direction are measured, all dimensions of the tetrahedron can be determined. By performing these calculations with rational trigonometry instead of classical trigonometry, the computations turn out to be 28.7% faster, with equal numerical accuracy. The position of the head of the Nao can also be externally measured with the OptiTrack system. The difference between the externally measured position and the position predicted internally from sensor data gives mean absolute errors in the 3-6 centimeter range, when the gravitational direction is estimated from the vanishing point of the outer edges of the goal posts.
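
The geometry described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses classical vector algebra rather than the rational trigonometry of the paper. The function name `locate_camera` and its arguments are hypothetical, and it assumes that both viewing directions point below the horizon and that the first point is not directly underneath the camera.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(a, s):
    return tuple(x * s for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))

def locate_camera(d1, d2, g, f1, f2):
    """Hypothetical helper: estimate (x, y, h) of the camera.

    d1, d2 : viewing directions to the two ground points (camera frame)
    g      : gravitational direction, pointing down (camera frame)
    f1, f2 : known (x, y) field coordinates of the two points
    """
    g_hat = unit(g)
    # Points on each ray at a vertical drop of exactly 1 below the camera.
    r1 = scale(d1, 1.0 / dot(d1, g_hat))
    r2 = scale(d2, 1.0 / dot(d2, g_hat))
    # Scaling both rays by h must reproduce the known distance on the field.
    h = math.dist(f1, f2) / math.dist(r1, r2)
    # Horizontal offsets from the point below the camera to each ground point.
    u1 = sub(scale(r1, h), scale(g_hat, h))
    u2 = sub(scale(r2, h), scale(g_hat, h))
    # Right-handed horizontal basis (e1, e2, up) in the camera frame.
    e1 = unit(u1)
    e2 = cross(scale(g_hat, -1.0), e1)
    a1 = (dot(u1, e1), dot(u1, e2))
    a2 = (dot(u2, e1), dot(u2, e2))
    # Rotation about the vertical that aligns the observed baseline
    # with the known baseline on the field.
    theta = (math.atan2(f2[1] - f1[1], f2[0] - f1[0])
             - math.atan2(a2[1] - a1[1], a2[0] - a1[0]))
    c, s = math.cos(theta), math.sin(theta)
    x = f1[0] - (c * a1[0] - s * a1[1])
    y = f1[1] - (s * a1[0] + c * a1[1])
    return x, y, h
```

For example, a camera 0.5 m above field position (1, 2) that observes the points (3, 2) and (3, 4) is recovered by `locate_camera((2, 0.5, 0), (2, 0.5, 2), (0, 1, 0), (3, 2), (3, 4))`, where the camera frame is tilted so that gravity lies along its y-axis.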

CV · May 27, 2025
Supervised and self-supervised land-cover segmentation & classification of the Biesbosch wetlands

Eva Gmelich Meijling, Roberto Del Prete, Arnoud Visser

Accurate wetland land-cover classification is essential for environmental monitoring, biodiversity assessment, and sustainable ecosystem management. However, the scarcity of annotated data, especially for high-resolution satellite imagery, poses a significant challenge for supervised learning approaches. To tackle this issue, this study presents a methodology for wetland land-cover segmentation and classification that adopts both supervised and self-supervised learning (SSL). We train a U-Net model from scratch on Sentinel-2 imagery across six wetland regions in the Netherlands, achieving a baseline model accuracy of 85.26%. Addressing the limited availability of labeled data, the results show that SSL pretraining with an autoencoder can improve accuracy, especially for the high-resolution imagery where it is more difficult to obtain labeled data, reaching an accuracy of 88.23%. Furthermore, we introduce a framework to scale manually annotated high-resolution labels to medium-resolution inputs. While the quantitative performance between resolutions is comparable, high-resolution imagery provides significantly sharper segmentation boundaries and finer spatial detail. As part of this work, we also contribute a curated Sentinel-2 dataset with Dynamic World labels, tailored for wetland classification tasks and made publicly available.
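
The abstract does not say how high-resolution labels are scaled to medium-resolution inputs. One common option, shown here purely as an assumption and not as the authors' framework, is majority-vote downsampling: each block of high-resolution label pixels is replaced by its most frequent class. The function name `downsample_labels` and the integer `factor` parameter are hypothetical.

```python
from collections import Counter

def downsample_labels(mask, factor):
    """Hypothetical helper: majority-vote downsampling of a label mask.

    mask   : 2D list of integer class labels (high resolution)
    factor : number of high-resolution pixels per medium-resolution
             pixel along each axis
    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = len(mask), len(mask[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [mask[r + i][c + j]
                     for i in range(factor)
                     for j in range(factor)]
            # Most frequent class in the block wins.
            out_row.append(Counter(block).most_common(1)[0][0])
        out.append(out_row)
    return out

high_res = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, 3, 3],
    [2, 0, 3, 1],
]
medium_res = downsample_labels(high_res, 2)
```

Majority voting keeps the output a valid class map, which averaging would not; the cost is that thin structures such as narrow creeks can vanish at the coarser resolution.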

CV · Jun 27, 2019
A shallow residual neural network to predict the visual cortex response

Anne-Ruth José Meijer, Arnoud Visser

Understanding how the visual cortex of the human brain really works is still an open problem for science today. A better understanding of natural intelligence could also benefit object-recognition algorithms based on convolutional neural networks. In this paper we demonstrate the value of using a shallow residual neural network to predict the visual cortex response. The benefit of this approach is that earlier stages of the network can be accurately trained, which allows us to add more layers at the earlier stage. With this additional layer the prediction of the visual brain activity improves from $10.4\%$ (block 1) to $15.53\%$ (last fully connected layer). By training the network for more than 10 epochs this improvement can become even larger.