Rüdiger Dillmann

RO
h-index: 11
12 papers
141 citations
Novelty: 46%
AI Score: 42

12 Papers

21.1 RO May 19
A Practical Framework of Key Performance Indicators for Multi-Robot Lunar and Planetary Field Tests

Julia Richter, David Oberacker, Gabriela Ligeza et al.

Robotic prospecting for critical resources on the Moon, such as ilmenite, rare earth elements, and water ice, requires robust exploration methods given the diverse terrain and harsh environmental conditions. Although numerous analog field trials address these goals, comparing their results remains challenging because of differences in robot platforms and experimental setups. These missions typically assess performance using selected, scenario-specific engineering metrics that fail to establish a clear link between field performance and science-driven objectives. In this paper, we address this gap by deriving a structured framework of key performance indicators (KPIs) from three realistic multi-robot lunar scenarios reflecting scientific objectives and operational constraints. Our framework emphasizes scenario-dependent priorities in efficiency, robustness, and precision, and is explicitly designed for practical applicability in field deployments. We validated the framework in a multi-robot field test and found it practical and easy to apply for efficiency- and robustness-related KPIs, whereas precision-oriented KPIs require reliable ground-truth data that is not always feasible to obtain in outdoor analog environments. Overall, we propose this framework as a common evaluation standard enabling consistent, goal-oriented comparison of multi-robot field trials and supporting systematic development of robotic systems for future planetary exploration.

RO Mar 18, 2022
Comparing SONN Types for Efficient Robot Motion Planning in the Configuration Space

Lea Steffen, Tobias Weyer, Katharina Glueck et al.

Motion planning in the configuration space (C-space) offers benefits such as smooth trajectories. However, it becomes more complex as the number of degrees of freedom (DOF) increases, because the dimensionality of the search space grows directly with the DOF. Self-organizing neural networks (SONN) and their best-known representative, the Self-Organizing Map, have proven to be useful tools for C-space reduction while preserving its underlying topology, as presented in [29]. In this work, we extend our previous study with additional models and adapt the approach from human motion data to robot kinematics. The evaluation includes the best-performing models from [29] and three additional SONN architectures, representing a consistent continuation of this previous work. Trajectories planned with the different SONN models were successfully tested in a robot simulation.

RO Mar 22, 2023
Learning Human-Inspired Force Strategies for Robotic Assembly

Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

The programming of robotic assembly tasks is a key component in manufacturing and automation. Force-sensitive assembly, however, often requires reactive strategies to handle slight changes in positioning and unforeseen part jamming. Learning such strategies from human performance is a promising approach, but it faces two common challenges: handling low part clearances, which are difficult to capture from demonstrations, and learning intuitive strategies offline without access to the real hardware. We address these two challenges by learning probabilistic force strategies from data that are easily acquired offline in a robot-less simulation from human demonstrations with a joystick. We combine a Long Short-Term Memory (LSTM) network and a Mixture Density Network (MDN) to model human-inspired behavior in such a way that the learned strategies transfer easily onto real hardware. The experiments show a UR10e robot completing a plastic assembly with clearances of less than 100 micrometers, using strategies that were demonstrated solely in simulation.
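To make the "probabilistic force strategies" idea concrete, the following is a minimal sketch of what a Mixture Density Network head typically outputs and how it is scored. It is not the authors' implementation: it assumes a one-dimensional Gaussian mixture over a single force component, and the function names `mdn_negative_log_likelihood` and `sample_force` are hypothetical. In a full model, the mixture parameters would be produced by the LSTM at every time step.

```python
import math
import random


def mdn_negative_log_likelihood(weights, means, stddevs, target):
    """Negative log-likelihood of a scalar target under a 1-D Gaussian mixture.

    weights, means, stddevs: per-component mixture parameters (assumed to be
    the outputs of a network head; the weights are assumed to sum to 1).
    """
    likelihood = 0.0
    for w, mu, sigma in zip(weights, means, stddevs):
        norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
        likelihood += w * norm * math.exp(-0.5 * ((target - mu) / sigma) ** 2)
    return -math.log(likelihood)


def sample_force(weights, means, stddevs, rng):
    """Draw one force value: pick a component by weight, then sample from it."""
    component = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[component], stddevs[component])


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical modes, e.g. "push" and "retract" along one axis.
    params = ([0.7, 0.3], [5.0, -2.0], [0.5, 0.5])
    print(mdn_negative_log_likelihood(*params, target=4.8))
    print(sample_force(*params, rng))
```

A mixture rather than a single Gaussian lets the model keep several plausible reactions to the same situation apart instead of averaging them, which is the usual motivation for an MDN output layer.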

RO Jan 30, 2024 Code
Efficient Gesture Recognition on Spiking Convolutional Networks Through Sensor Fusion of Event-Based and Depth Data

Lea Steffen, Thomas Trapp, Arne Roennau et al.

As intelligent systems become increasingly important in our daily lives, new ways of interaction are needed. Classical user interfaces pose issues for the physically impaired and are not always practical or convenient. Gesture recognition is an alternative, but it is often not reactive enough when conventional cameras are used. This work proposes a Spiking Convolutional Neural Network that processes event and depth data for gesture recognition. The network is simulated using the open-source neuromorphic computing framework LAVA for offline training and evaluation on an embedded system. For the evaluation, three open-source data sets are used. Since these do not represent the applied bi-modality, a new data set with synchronized event and depth data was recorded. The results show that temporal encoding of depth information is viable and that modality fusion, even on differently encoded data, benefits network performance and generalization capabilities.

RO Feb 18, 2022 Code
Motion Macro Programming on Assistive Robotic Manipulators: Three Skill Types for Everyday Tasks

Stefan Scherzinger, Pascal Becker, Arne Roennau et al.

Assistive robotic manipulators are becoming increasingly important for people with disabilities. Teleoperating the manipulator in mundane tasks is part of their daily lives. Instead of steering the robot through all actions, applying self-recorded motion macros could greatly facilitate repetitive tasks. Dynamic Movement Primitives (DMP) are a powerful method for skill learning via teleoperation. For this use case, however, they need simple heuristics to specify where to start, stop, and parameterize a skill without a background in computer science and academic sensor setups for autonomous perception. To achieve this goal, this paper provides the concept of local, global, and hybrid skills that form a modular basis for composing single-handed tasks of daily living. These skills are specified implicitly and can easily be programmed by users themselves, requiring only their basic robotic manipulator. The paper contributes all details for robot-agnostic implementations. Experiments validate the developed methods for exemplary tasks, such as scratching an itchy spot, sorting objects on a desk, and feeding a piggy bank with coins. The paper is accompanied by an open-source implementation at https://github.com/fzi-forschungszentrum-informatik/ArNe

RO Dec 21, 2023
EfficientPPS: Part-aware Panoptic Segmentation of Transparent Objects for Robotic Manipulation

Benjamin Alt, Minh Dang Nguyen, Andreas Hermann et al.

The use of autonomous robots for assistance tasks in hospitals has the potential to free up qualified staff and improve patient care. However, the ubiquity of deformable and transparent objects in hospital settings poses significant challenges to vision-based perception systems. We present EfficientPPS, a neural architecture for part-aware panoptic segmentation that provides robots with semantically rich visual information for grasping and manipulation tasks. We also present an unsupervised data collection and labelling method to reduce the need for human involvement in the training process. EfficientPPS is evaluated on a dataset containing real-world hospital objects and demonstrated to be robust and efficient in grasping transparent transfusion bags with a collaborative robot arm.

RO Sep 24, 2020
Virtual Forward Dynamics Models for Cartesian Robot Control

Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

In industrial contexts, admittance control is an important scheme for programming robots for interaction tasks with their environments. Those robots usually implement high-gain disturbance rejection at the joint level and hide direct access to the actuators behind velocity- or position-controlled interfaces. When wrist force-torque sensors are used to add compliance to these systems, force-resolved control laws must map the control signals from Cartesian space to joint motion. Although forward dynamics algorithms would fit that task description well, their application to Cartesian robot control is not well researched. This paper proposes a general concept of virtual forward dynamics models for Cartesian robot control and investigates how the forward mapping behaves in comparison to well-established alternatives. By decreasing the virtual system's link masses relative to the end effector, the virtual system becomes linear in the operational space dynamics. Experiments focus on stability and manipulability, particularly in singular configurations. Our results show that, through this trick, forward dynamics can combine the benefits of both the Jacobian inverse and the Jacobian transpose and, in this regard, outperforms the Damped Least Squares method.
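The three established mappings that the abstract compares against can be illustrated on a planar two-link arm. This is a sketch of the standard Jacobian transpose, Jacobian inverse, and Damped Least Squares steps only; it does not implement the paper's virtual forward dynamics model. The arm geometry, the gain, the damping value, and all function names are assumptions chosen for illustration.

```python
import math


def jacobian(q1, q2, l1=1.0, l2=1.0):
    """Position Jacobian of a planar 2-link arm (hypothetical link lengths)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def inverse(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def norm(v):
    return math.sqrt(v[0] ** 2 + v[1] ** 2)


def jacobian_transpose_step(J, dx, gain=1.0):
    """dq = gain * J^T dx: always bounded, but only follows the error gradient."""
    return [gain * v for v in matvec(transpose(J), dx)]


def jacobian_inverse_step(J, dx):
    """dq = J^-1 dx: exact to first order, but blows up near singularities."""
    return matvec(inverse(J), dx)


def damped_least_squares_step(J, dx, damping=0.1):
    """dq = J^T (J J^T + damping^2 I)^-1 dx: bounded approximation of the inverse."""
    JJt = matmul(J, transpose(J))
    JJt[0][0] += damping ** 2
    JJt[1][1] += damping ** 2
    return matvec(transpose(J), matvec(inverse(JJt), dx))


if __name__ == "__main__":
    dx = [0.01, 0.0]                 # small Cartesian displacement request
    J = jacobian(0.3, 1e-4)          # almost stretched out: near-singular
    for step in (jacobian_transpose_step, jacobian_inverse_step,
                 damped_least_squares_step):
        print(step.__name__, norm(step(J, dx)))
```

Near the stretched-out configuration the inverse step demands very large joint motion, while the transpose and damped steps stay small. That trade-off between accuracy and robustness in singular configurations is the context in which the abstract positions its approach.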

RO Aug 17, 2019
Contact Skill Imitation Learning for Robot-Independent Assembly Programming

Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

Robotic automation is a key driver for the advancement of technology. The skills of human workers, however, are difficult to program and seem currently unmatched by technical systems. In this work we present a data-driven approach to extract and learn robot-independent contact skills from human demonstrations in simulation environments, using a Long Short Term Memory (LSTM) network. Our model learns to generate error-correcting sequences of forces and torques in task space from object-relative motion, which industrial robots carry out through a Cartesian force control scheme on the real setup. This scheme uses forward dynamics computation of a virtually conditioned twin of the manipulator to solve the inverse kinematics problem. We evaluate our methods with an assembly experiment, in which our algorithm handles part tilting and jamming in order to succeed. The results show that the skill is robust towards localization uncertainty in task space and across different joint configurations of the robot. With our approach, non-experts can easily program force-sensitive assembly tasks in a robot-independent way.

RO Aug 17, 2019
Inverse Kinematics with Forward Dynamics Solvers for Sampled Motion Tracking

Stefan Scherzinger, Arne Roennau, Rüdiger Dillmann

Tracking Cartesian motion with end effectors is a fundamental task in robot control. For motion that is not known in advance, the solvers must find fast solutions to the inverse kinematics (IK) problem for discretely sampled target poses. At the joint control level, however, the robot's actuators operate in a continuous domain, requiring smooth transitions between individual states. In this work, we present a boost to the well-known Jacobian transpose method to address this goal, using the mass matrix of a virtually conditioned twin of the manipulator. Results on the UR10 show superior convergence and quality of our dynamics-based solver against the plain Jacobian method. Our algorithm is straightforward to implement as a controller, using common robotics libraries.

NE Apr 9, 2019
Embodied Neuromorphic Vision with Event-Driven Random Backpropagation

Jacques Kaiser, Alexander Friedrich, J. Camilo Vasquez Tieck et al.

Spike-based communication between biological neurons is sparse and unreliable. This enables the brain to process visual information from the eyes efficiently. Taking inspiration from biology, artificial spiking neural networks coupled with silicon retinas attempt to model these computations. Recent findings in machine learning allowed the derivation of a family of powerful synaptic plasticity rules approximating backpropagation for spiking networks. Are these rules capable of processing real-world visual sensory data? In this paper, we evaluate the performance of Event-Driven Random Back-Propagation (eRBP) at learning representations from event streams provided by a Dynamic Vision Sensor (DVS). First, we show that eRBP matches state-of-the-art performance on the DvsGesture dataset with the addition of a simple covert attention mechanism. By remapping visual receptive fields relative to the center of the motion, this attention mechanism provides translation invariance at low computational cost compared to convolutions. Second, we successfully integrate eRBP in a real robotic setup, where a robotic arm grasps objects according to detected visual affordances. In this setup, visual information is actively sensed by a DVS mounted on a robotic head performing microsaccadic eye movements. We show that our method classifies affordances within 100 ms after microsaccade onset, which is comparable to human performance reported in behavioral studies. Our results suggest that advances in neuromorphic technology and plasticity rules enable the development of autonomous robots operating at high speed and low energy consumption.
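The remapping of receptive fields relative to the center of motion can be sketched as a recentering of the event stream. This is an illustrative reading of the abstract, not the paper's mechanism: it assumes events are `(x, y, t, polarity)` tuples, uses the per-axis median as the center estimate, and crops to a fixed window. The function name `recenter_events` and the window size are hypothetical.

```python
import math
from statistics import median


def recenter_events(events, window=32):
    """Shift events so the estimated motion center lands in the window center.

    events: list of (x, y, t, polarity) tuples from an event camera.
    Events that fall outside the window after the shift are dropped.
    """
    if not events:
        return []
    cx = median(e[0] for e in events)  # assumed center estimate: median x
    cy = median(e[1] for e in events)  # assumed center estimate: median y
    half = window // 2
    recentered = []
    for x, y, t, polarity in events:
        u = math.floor(x - cx) + half
        v = math.floor(y - cy) + half
        if 0 <= u < window and 0 <= v < window:
            recentered.append((u, v, t, polarity))
    return recentered


if __name__ == "__main__":
    gesture = [(10, 10, 0, 1), (12, 11, 1, 0), (14, 12, 2, 1)]
    shifted = [(x + 40, y + 25, t, p) for x, y, t, p in gesture]
    # The same gesture at two image locations maps to the same input.
    print(recenter_events(gesture) == recenter_events(shifted))
```

Because the shift is computed from the events themselves, the downstream network sees the same coordinates wherever the motion occurs in the image, which gives translation invariance without convolutional weight sharing.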

CV Aug 1, 2018
Real-time image-based instrument classification for laparoscopic surgery

Sebastian Bodenstedt, Antonia Ohnemus, Darko Katic et al.

During laparoscopic surgery, context-aware assistance systems aim to alleviate some of the difficulties the surgeon faces. To ensure that the right information is provided at the right time, the current phase of the intervention has to be known. Real-time localization and classification of the surgical tools currently in use are key components of both activity-based phase recognition and assistance generation. In this paper, we present an image-based approach that detects and classifies tools during laparoscopic interventions in real time. First, potential instrument bounding boxes are detected using a pixel-wise random forest segmentation. Each of these bounding boxes is then classified using a cascade of random forests. For this, multiple features, such as histograms over hue and saturation, gradients, and SURF features, are extracted from each detected bounding box. We evaluated our approach on five different videos from two different types of procedures. We distinguished between the four most common classes of instruments (LigaSure, atraumatic grasper, aspirator, clip applier) and background. Our method successfully located up to 86% of all instruments. On manually provided bounding boxes, we achieve an instrument type recognition rate of up to 58%, and on automatically detected bounding boxes up to 49%. To our knowledge, this is the first approach that allows an image-based classification of surgical tools in a laparoscopic setting in real time.
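One of the features named above, histograms over hue and saturation, can be sketched with the standard library alone. This is a minimal illustration under assumptions: the bin counts, the choice of two separate normalized 1-D histograms concatenated into one vector, and the function name `hue_saturation_histograms` are not taken from the paper.

```python
import colorsys


def hue_saturation_histograms(pixels, hue_bins=8, sat_bins=4):
    """Concatenated, normalized hue and saturation histograms of a box.

    pixels: iterable of (r, g, b) tuples with 8-bit channels, i.e. the
    pixels inside one candidate bounding box.
    """
    hue_hist = [0.0] * hue_bins
    sat_hist = [0.0] * sat_bins
    count = 0
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        hue_hist[min(int(h * hue_bins), hue_bins - 1)] += 1
        sat_hist[min(int(s * sat_bins), sat_bins - 1)] += 1
        count += 1
    if count:
        hue_hist = [c / count for c in hue_hist]
        sat_hist = [c / count for c in sat_hist]
    return hue_hist + sat_hist


if __name__ == "__main__":
    box = [(255, 0, 0), (255, 0, 0), (0, 255, 0)]  # toy "bounding box"
    print(hue_saturation_histograms(box))
```

A feature vector like this would be one input among several to a classifier such as a random forest. Normalizing by pixel count keeps boxes of different sizes comparable.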

CV Feb 13, 2017
Unsupervised temporal context learning using convolutional neural networks for laparoscopic workflow analysis

Sebastian Bodenstedt, Martin Wagner, Darko Katić et al.

Computer-assisted surgery (CAS) aims to provide the surgeon with the right type of assistance at the right moment. Such assistance systems are especially relevant in laparoscopic surgery, where CAS can alleviate some of the drawbacks that surgeons incur. For many assistance functions, e.g. displaying the location of a tumor at the appropriate time or suggesting what instruments to prepare next, analyzing the surgical workflow is a prerequisite. Since laparoscopic interventions are performed via endoscope, the video signal is an obvious sensor modality to rely on for workflow analysis. Image-based workflow analysis tasks in laparoscopy, such as phase recognition, skill assessment, video indexing, or automatic annotation, require a temporal distinction between video frames. Generally, computer-vision-based methods that generalize from previously seen data are used. For training such methods, large amounts of annotated data are necessary. Annotating surgical data requires expert knowledge; therefore, collecting a sufficient amount of data is difficult, time-consuming, and not always feasible. In this paper, we address this problem by presenting an unsupervised method for training a convolutional neural network (CNN) to differentiate between laparoscopic video frames on a temporal basis. We extract video frames at regular intervals from 324 unlabeled laparoscopic interventions, resulting in a dataset of approximately 2.2 million images. From this dataset, we extract image pairs from the same video and train a CNN to determine their temporal order. To solve this problem, the CNN has to extract features that are relevant for comprehending laparoscopic workflow. Furthermore, we demonstrate that such a CNN can be adapted for surgical workflow segmentation. We performed image-based workflow segmentation on a publicly available dataset of 7 cholecystectomies and 9 colorectal interventions.
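The pretext task described above, ordering two frames from the same video, needs no manual annotation because the label follows from the frame indices. The following sketch shows only that pair-sampling step, under assumptions: videos are represented by their frame counts, pairs are drawn uniformly at random, and the function name `sample_temporal_order_pairs` is hypothetical. The CNN itself and the paper's actual sampling strategy are not shown.

```python
import random


def sample_temporal_order_pairs(video_lengths, pairs_per_video, rng):
    """Sample frame pairs within each video and derive the order label.

    video_lengths: number of extracted frames per video.
    Returns a list of (video_id, first_index, second_index, label), where
    label is 1 if the first frame precedes the second in time, else 0.
    """
    samples = []
    for video_id, n_frames in enumerate(video_lengths):
        if n_frames < 2:
            continue  # a single frame cannot form a pair
        for _ in range(pairs_per_video):
            i, j = rng.sample(range(n_frames), 2)  # two distinct frames
            label = 1 if i < j else 0
            samples.append((video_id, i, j, label))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    for sample in sample_temporal_order_pairs([100, 250], 3, rng):
        print(sample)
```

Each tuple identifies the two images to feed to the network together with a binary target, so the supervision signal comes from the video timeline rather than from expert labels.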