Antonio Krüger

HC
h-index: 169
12 papers
128 citations
Novelty: 32%
AI Score: 36

12 Papers

RO · Nov 29, 2023 · Code
Toward a Surgeon-in-the-Loop Ophthalmic Robotic Apprentice using Reinforcement and Imitation Learning

Amr Gomaa, Bilal Mahdy, Niko Kleer et al.

Robot-assisted surgical systems have demonstrated significant potential in enhancing surgical precision and minimizing human errors. However, existing systems cannot accommodate individual surgeons' unique preferences and requirements. Additionally, they primarily focus on general surgeries (e.g., laparoscopy) and are unsuitable for highly precise microsurgeries, such as ophthalmic procedures. Thus, we propose an image-guided approach for surgeon-centered autonomous agents that can adapt to the individual surgeon's skill level and preferred surgical techniques during ophthalmic cataract surgery. Our approach trains reinforcement and imitation learning agents simultaneously using curriculum learning approaches guided by image data to perform all tasks of the incision phase of cataract surgery. By integrating the surgeon's actions and preferences into the training process, our approach enables the robot to implicitly learn and adapt to the individual surgeon's unique techniques through surgeon-in-the-loop demonstrations. This results in a more intuitive and personalized surgical experience for the surgeon while ensuring consistent performance for the autonomous robotic apprentice. We define and evaluate the effectiveness of our approach in a simulated environment using our proposed metrics and highlight the trade-off between a generic agent and a surgeon-centered adapted agent. Finally, our approach has the potential to extend to other ophthalmic and microsurgical procedures, opening the door to a new generation of surgeon-in-the-loop autonomous surgical robots. We provide an open-source simulation framework for future development and reproducibility at https://github.com/amrgomaaelhady/CataractAdaptSurgRobot.

RO · Jul 7, 2023
Teach Me How to Learn: A Perspective Review towards User-centered Neuro-symbolic Learning for Robotic Surgical Systems

Amr Gomaa, Bilal Mahdy, Niko Kleer et al.

Recent advances in machine learning models have allowed robots to identify objects on a perceptual nonsymbolic level (e.g., through sensor fusion and natural language understanding). However, these primarily black-box learning models still lack interpretability and transferability and demand large amounts of data and computation. An alternative solution is to teach a robot on both perceptual nonsymbolic and conceptual symbolic levels through hybrid neurosymbolic learning approaches with expert feedback (i.e., human-in-the-loop learning). This work proposes a concept for this user-centered hybrid learning paradigm that focuses on robotic surgical situations. While most recent research has focused on hybrid learning for non-robotic and some generic robotic domains, little work focuses on surgical robotics. We survey this related research while focusing on human-in-the-loop surgical robotic systems. This evaluation highlights the most prominent solutions for autonomous surgical robots and the challenges surgeons face when interacting with these systems. Finally, we envision possible ways to address these challenges using online apprenticeship learning based on implicit and explicit feedback from expert surgeons.

CV · Sep 8, 2023 · Code
SynthoGestures: A Novel Framework for Synthetic Dynamic Hand Gesture Generation for Driving Scenarios

Amr Gomaa, Robin Zitt, Guillermo Reyes et al.

Creating a diverse and comprehensive dataset of hand gestures for dynamic human-machine interfaces in the automotive domain can be challenging and time-consuming. To overcome this challenge, we propose using synthetic gesture datasets generated by virtual 3D models. Our framework utilizes Unreal Engine to synthesize realistic hand gestures, offering customization options and reducing the risk of overfitting. Multiple variants, including gesture speed, performance, and hand shape, are generated to improve generalizability. In addition, we simulate different camera locations and types, such as RGB, infrared, and depth cameras, without incurring additional time and cost to obtain these cameras. Experimental results demonstrate that our proposed framework, SynthoGestures (https://github.com/amrgomaaelhady/SynthoGestures), improves gesture recognition accuracy and can replace or augment real-hand datasets. By saving time and effort in the creation of the dataset, our tool accelerates the development of gesture recognition systems for automotive applications.

HC · Oct 22, 2024 · Code
AdaptoML-UX: An Adaptive User-centered GUI-based AutoML Toolkit for Non-AI Experts and HCI Researchers

Amr Gomaa, Michael Sargious, Antonio Krüger

The increasing integration of machine learning across various domains has underscored the necessity for accessible systems that non-experts can utilize effectively. To address this need, the field of automated machine learning (AutoML) has developed tools to simplify the construction and optimization of ML pipelines. However, existing AutoML solutions often lack efficiency in creating online pipelines and ease of use for Human-Computer Interaction (HCI) applications. Therefore, in this paper, we introduce AdaptoML-UX, an adaptive framework that incorporates automated feature engineering, machine learning, and incremental learning to assist non-AI experts in developing robust, user-centered ML models. Our toolkit demonstrates the capability to adapt efficiently to diverse problem domains and datasets, particularly in HCI, thereby reducing the necessity for manual experimentation and conserving time and resources. Furthermore, it supports model personalization through incremental learning, customizing models to individual user behaviors. HCI researchers can employ AdaptoML-UX (https://github.com/MichaelSargious/AdaptoML_UX) without requiring specialized expertise, as it automates the selection of algorithms, feature engineering, and hyperparameter tuning based on the unique characteristics of the data.

HC · Jan 29, 2024 · Code
Looking for a better fit? An Incremental Learning Multimodal Object Referencing Framework adapting to Individual Drivers

Amr Gomaa, Guillermo Reyes, Michael Feld et al.

The rapid advancement of the automotive industry towards automated and semi-automated vehicles has rendered traditional methods of vehicle interaction, such as touch-based and voice command systems, inadequate for a widening range of non-driving related tasks, such as referencing objects outside of the vehicle. Consequently, research has shifted toward gestural input (e.g., hand, gaze, and head pose gestures) as a more suitable mode of interaction during driving. However, due to the dynamic nature of driving and individual variation, there are significant differences in drivers' gestural input performance. While, in theory, this inherent variability could be moderated by substantial data-driven machine learning models, prevalent methodologies lean towards constrained, single-instance trained models for object referencing. These models show a limited capacity to continuously adapt to the divergent behaviors of individual drivers and the variety of driving scenarios. To address this, we propose IcRegress, a novel regression-based incremental learning approach that adapts to changing behavior and the unique characteristics of drivers engaged in the dual task of driving and referencing objects. We suggest a more personalized and adaptable solution for multimodal gestural interfaces, employing continuous lifelong learning to enhance driver experience, safety, and convenience. Our approach was evaluated using an outside-the-vehicle object referencing use case, highlighting the superiority of the incremental learning models adapted over a single trained model across various driver traits such as handedness, driving experience, and numerous driving conditions. Finally, to facilitate reproducibility, ease deployment, and promote further research, we offer our approach as an open-source framework at https://github.com/amrgomaaelhady/IcRegress.
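
The contrast this abstract draws between a single trained model and one that keeps adapting to an individual driver can be illustrated with a generic online regressor. The following is a minimal sketch and not the IcRegress implementation: the `OnlineLinearRegressor` class, the `mean_abs_error` helper, and the toy data (a population-level mapping plus one driver with a constant offset) are all hypothetical, and plain stochastic gradient descent stands in for whatever update rule the framework actually uses.

```python
class OnlineLinearRegressor:
    """Linear model updated one sample at a time with stochastic gradient descent."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def partial_fit(self, x, y):
        """Take one gradient step on the squared error of a single sample."""
        error = self.predict(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error


def mean_abs_error(model, samples):
    """Average absolute prediction error over (features, target) pairs."""
    return sum(abs(model.predict(x) - y) for x, y in samples) / len(samples)


# Hypothetical toy data: a generic mapping and one driver whose targets are shifted.
inputs = [[i / 10] for i in range(11)]
generic_samples = [(x, 2.0 * x[0]) for x in inputs]
driver_samples = [(x, 2.0 * x[0] + 1.0) for x in inputs]

model = OnlineLinearRegressor(n_features=1)
for _ in range(200):  # one-off training on generic data
    for x, y in generic_samples:
        model.partial_fit(x, y)

error_before = mean_abs_error(model, driver_samples)

for _ in range(200):  # incremental updates as this driver's data arrives
    for x, y in driver_samples:
        model.partial_fit(x, y)

error_after = mean_abs_error(model, driver_samples)
print(f"before adaptation: {error_before:.3f}, after adaptation: {error_after:.3f}")
```

In this toy setup the generic model carries the driver's offset as a persistent error until the incremental updates absorb it, which is the kind of gap the abstract attributes to single-instance trained models.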

CY · Jan 29, 2025
International AI Safety Report

Yoshua Bengio, Sören Mindermann, Daniel Privitera et al. · eth-zurich, mit

The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, representing diverse perspectives and disciplines. Led by the report's Chair, these independent experts collectively had full discretion over the report's content.

HC · Aug 3, 2025
Implicit Search Intent Recognition using EEG and Eye Tracking: Novel Dataset and Cross-User Prediction

Mansi Sharma, Shuang Chen, Philipp Müller et al.

For machines to effectively assist humans in challenging visual search tasks, they must differentiate whether a human is simply glancing into a scene (navigational intent) or searching for a target object (informational intent). Previous research proposed combining electroencephalography (EEG) and eye-tracking measurements to recognize such search intents implicitly, i.e., without explicit user input. However, the applicability of these approaches to real-world scenarios suffers from two key limitations. First, previous work used fixed search times in the informational intent condition -- a stark contrast to visual search, which naturally terminates when the target is found. Second, methods incorporating EEG measurements addressed prediction scenarios that require ground truth training data from the target user, which is impractical in many use cases. We address these limitations by making the first publicly available EEG and eye-tracking dataset for navigational vs. informational intent recognition, where the user determines search times. We present the first method for cross-user prediction of search intents from EEG and eye-tracking recordings and reach 84.5% accuracy in leave-one-user-out evaluations -- comparable to within-user prediction accuracy (85.5%) but offering much greater flexibility.
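
The leave-one-user-out protocol named in this abstract holds out every sample of one user for testing and trains on all remaining users, repeating this once per user. The following is a minimal sketch of the splitting step only, assuming each sample carries a user identifier; the `leave_one_user_out` function and the toy `user_ids` list are hypothetical and are not taken from the paper's code or dataset.

```python
from collections import defaultdict


def leave_one_user_out(user_ids):
    """Yield (held_out_user, train_indices, test_indices) once per distinct user."""
    groups = defaultdict(list)
    for index, user in enumerate(user_ids):
        groups[user].append(index)
    for held_out in groups:  # users in order of first appearance
        test = groups[held_out]
        train = [i for i, user in enumerate(user_ids) if user != held_out]
        yield held_out, train, test


# Hypothetical toy data: six samples recorded from three users.
user_ids = ["u1", "u1", "u2", "u3", "u3", "u3"]
splits = list(leave_one_user_out(user_ids))
for held_out, train, test in splits:
    print(held_out, train, test)
```

Because no sample from the held-out user ever appears in the training indices, accuracy averaged over these folds estimates performance on an unseen user, in contrast to within-user evaluation, which requires training data from the target user.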

CV · Aug 3, 2025
Distinguishing Target and Non-Target Fixations with EEG and Eye Tracking in Realistic Visual Scenes

Mansi Sharma, Camilo Andrés Martínez Martínez, Benedikt Emanuel Wirth et al.

Distinguishing target from non-target fixations during visual search is a fundamental building block to understand users' intended actions and to build effective assistance systems. While prior research indicated the feasibility of classifying target vs. non-target fixations based on eye tracking and electroencephalography (EEG) data, these studies were conducted with explicitly instructed search trajectories and abstract visual stimuli, and they disregarded any scene context. This is in stark contrast with the fact that human visual search is largely driven by scene characteristics and raises questions regarding generalizability to more realistic scenarios. To close this gap, we, for the first time, investigate the classification of target vs. non-target fixations during free visual search in realistic scenes. In particular, we conducted a 36-participant user study using a large variety of 140 realistic visual search scenes in two highly relevant application scenarios: searching for icons on desktop backgrounds and finding tools in a cluttered workshop. Our approach based on gaze and EEG features outperforms the previous state-of-the-art approach based on a combination of fixation duration and saccade-related potentials. We perform extensive evaluations to assess the generalizability of our approach across scene types. Our approach significantly advances the ability to distinguish between target and non-target fixations in realistic scenarios, achieving 83.6% accuracy in cross-user evaluations. This substantially outperforms previous methods based on saccade-related potentials, which reached only 56.9% accuracy.

HC · Jul 27, 2021
Design Guidelines to Increase the Persuasiveness of Achievement Goals for Physical Activity

Maximilian Altmeyer, Pascal Lessel, Atiq Ur Rehman Waqar et al.

Achievement goals are frequently used to support behavior change. However, they are often not specifically designed for this purpose, nor do they account for the degree to which a user already intends to perform the target behavior. In this paper, we investigate the perceived persuasiveness of different goal types as defined by the 3x2 Achievement Goal Model, what people like and dislike about them, and the role that behavior change intentions play when aiming at increasing step counts. We created visualizations for each goal type based on a qualitative pre-study (N=18) and ensured their comprehensibility (N=18). In an online experiment (N=118), we show that there are differences in the perception of these goal types and that behavior change intentions should be considered to maximize their persuasiveness as goals evolve. Next, we derive design guidelines on when to use which type of achievement goal and what to consider when using them.

HC · Jul 27, 2021
A Long-Term Investigation on the Effects of (Personalized) Gamification on Course Participation in a Gym

Maximilian Altmeyer, Marc Schubhan, Antonio Krüger et al.

Gamification is frequently used to motivate people to become more physically active. However, most systems follow a one-size-fits-all gamification approach, although past research has shown that interpersonal differences exist in the perception of gamification elements. Also, most studies investigating the effects of gamification are rather short, although it has been shown that gamification can suffer from novelty effects. In this paper, we address both these issues by investigating whether gamification elements, integrated into a fitness course booking system, have an effect on how frequently users participate in fitness courses in a gym (N=52) over a duration of 275 days (548 days including baseline). Also, the gamification elements that we implemented are tailored to specific Hexad user types, which allows us to investigate whether using suitable gamification elements leads to increased course participation. Our results show that gamification increased the participation in fitness courses significantly and that users who received a suitable set of gamification elements - according to their Hexad user type - increased their participation significantly more than others.

HC · Jan 16, 2021
Evaluating User Experiences in Mixed Reality

Dmitry Alexandrovsky, Susanne Putze, Valentin Schwind et al.

Measuring user experience in MR (i.e., AR/VR) user studies is essential. Researchers apply a wide range of measuring methods using objective (e.g., biosignals, time logging), behavioral (e.g., gaze direction, movement amplitude), and subjective (e.g., standardized questionnaires) metrics. Many of these measurement instruments were adapted from use cases outside of MR but have not been validated for usage in MR experiments. However, researchers are faced with various challenges and design alternatives when measuring immersive experiences. These challenges become even more diverse when running out-of-the-lab studies. Measurement methods of VR experience recently received much attention. For example, research has started embedding questionnaires in the VE for various applications, allowing users to stay closer to the ongoing experience while filling out the survey. However, there is a diversity in the interaction methods and practices on how the assessment procedure is conducted. This diversity in methods underlines a missing shared agreement of standardized measurement tools for VR experiences. AR research is strongly oriented toward the research methods from VR, e.g., using the same type of subjective questionnaires. However, some crucial technical differences require careful consideration during the evaluation. This workshop at CHI 2021 provides a foundation to exchange expertise and address challenges and opportunities of research methods in MR user studies. In doing so, our workshop launches a discussion of research methods that should lead to standardizing assessment methods in MR user studies. The outcomes of the workshop will be aggregated into a collective special issue journal article.

HC · Mar 7, 2019
Integrating Artificial and Human Intelligence for Efficient Translation

Nico Herbig, Santanu Pal, Josef van Genabith et al.

Current advances in machine translation increase the need for translators to switch from traditional translation to post-editing of machine-translated text, a process that saves time and improves quality. Human and artificial intelligence need to be integrated in an efficient way to leverage the advantages of both for the translation task. This paper outlines approaches at this boundary of AI and HCI and discusses open research questions to further advance the field.