Michael Hagenow

RO
h-index: 4
10 papers
128 citations
Novelty: 54%
AI Score: 47

10 Papers

RO · May 29
Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring

Seongheon Park, Wendi Li, Changdae Oh et al.

Vision-Language-Action (VLA) models enable robots to follow natural language instructions and generalize across diverse tasks, but they remain vulnerable to execution failures that compromise reliability in real-world deployment. Detecting such failures during execution is therefore critical for the robust deployment of embodied systems. Existing failure detection methods either rely on expensive action resampling or external models, while alternatives propagate trajectory-level labels uniformly across every timestep, obscuring localized failure signals. In this paper, we propose Hide-and-Seek, a framework that formulates VLA failure detection as a coarsely supervised learning problem. By combining inter-trajectory and intra-trajectory contrastive objectives, Hide-and-Seek localizes failure-indicative actions and induces temporally structured failure signals from trajectory-level supervision alone, without any step-level annotation. We evaluate Hide-and-Seek on LIBERO, VLABench, and a real-world robotic platform across three representative VLA policies: OpenVLA, $π_0$, and $π_{0.5}$. Our method achieves state-of-the-art multi-task failure detection performance with a practical accuracy-timeliness trade-off under conformal prediction, and generalizes well to both seen and unseen tasks.

RO · Mar 25, 2024
Grounding Language Plans in Demonstrations Through Counterfactual Perturbations

Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao et al.

Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs directly for planning in symbolic spaces, this work uses LLMs to guide the search of task structures and constraints implicit in multi-step demonstrations. Specifically, we borrow from manipulation planning literature the concept of mode families, which group robot configurations by specific motion constraints, to serve as an abstraction layer between the high-level language representations of an LLM and the low-level physical trajectories of a robot. By replaying a few human demonstrations with synthetic perturbations, we generate coverage over the demonstrations' state space with additional successful executions as well as counterfactuals that fail the task. Our explanation-based learning framework trains an end-to-end differentiable neural network to predict successful trajectories from failures and as a by-product learns classifiers that ground low-level states and images in mode families without dense labeling. The learned grounding classifiers can further be used to translate language plans into reactive policies in the physical domain in an interpretable manner. We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. Website: https://yanweiw.github.io/glide

RO · Apr 21
Multi-Cycle Spatio-Temporal Adaptation in Human-Robot Teaming

Alex Cuellar, Michael Hagenow, Julie Shah

Effective human-robot teaming is crucial for the practical deployment of robots in human workspaces. However, optimizing joint human-robot plans remains a challenge due to the difficulty of modeling individualized human capabilities and preferences. While prior research has leveraged the multi-cycle structure of domains like manufacturing to learn an individual's tendencies and adapt plans over repeated interactions, these techniques typically consider task-level and motion-level adaptation in isolation. Task-level methods optimize allocation and scheduling but often ignore spatial interference in close-proximity scenarios; conversely, motion-level methods focus on collision avoidance while ignoring the broader task context. This paper introduces RAPIDDS, a framework that unifies these approaches by modeling an individual's spatial behavior (motion paths) and temporal behavior (time required to complete tasks) over multiple cycles. RAPIDDS then jointly adapts task schedules and steers diffusion models of robot motions to maximize efficiency and minimize proximity accounting for these individualized models. We demonstrate the importance of this dual adaptation through an ablation study in simulation and a physical robot scenario using a 7-DOF robot arm. Finally, we present a user study (n=32) showing significant plan improvement compared to non-adaptive systems across both objective metrics, such as efficiency and proximity, and subjective measures, including fluency and user preference. See this paper's companion video at: https://youtu.be/55Q3lq1fINs.

RO · Sep 28, 2021
Affordance Template Registration via Human-in-the-loop Corrections

Michael Hagenow, Michael Zinn, Terrence Fong et al.

Affordance Templates (ATs) are a method for parameterizing objects for autonomous robot manipulations. In this approach, instances of an object are registered by positioning a model in a 3D environment, which requires a large amount of user input. We instead propose a registration method that combines autonomy and user corrections. For selected objects, the system determines both the model and corresponding pose autonomously. The user makes corrections only if the model or pose is incorrect. This method increases the level of autonomy compared to existing approaches, which can reduce user input and time on task. In this paper, we present an overview of existing methods, a description of our method, preliminary results, and planned future work.

RO · Sep 6, 2021
Task-Level Authoring for Remote Robot Teleoperation

Emmanuel Senft, Michael Hagenow, Kevin Welsh et al.

Remote teleoperation of robots can broaden the reach of domain specialists across a wide range of industries such as home maintenance, health care, light manufacturing, and construction. However, current direct control methods are impractical, and existing tools for programming robots remotely have focused on users with significant robotic experience. Extending robot remote programming to end users, i.e., users who are experts in a domain but novices in robotics, requires tools that balance the rich features necessary for complex teleoperation tasks with ease of use. The primary challenge to usability is that novice users are unable to specify complete and robust task plans to allow a robot to perform duties autonomously, particularly in highly variable environments. Our solution is to allow operators to specify shorter sequences of high-level commands, which we call task-level authoring, to create periods of variable robot autonomy. This approach allows inexperienced users to create robot behaviors in uncertain environments by interleaving exploration, specification of behaviors, and execution as separate steps. End users are able to break down the specification of tasks and adapt to the current needs of the interaction and environments, combining the reactivity of direct control with asynchronous operation. In this paper, we describe a prototype system contextualized in light manufacturing and its empirical validation in a user study where 18 participants with some programming experience were able to perform a variety of complex telemanipulation tasks with little training. Our results show that our approach allowed users to create flexible periods of autonomy and solve rich manipulation tasks. Furthermore, participants significantly preferred our system over comparative, more direct interfaces, demonstrating the potential of our approach.

RO · Aug 10, 2021
Recognizing Orientation Slip in Human Demonstrations

Michael Hagenow, Bolun Zhang, Bilge Mutlu et al.

Manipulations of a constrained object often use a non-rigid grasp that allows the object to rotate relative to the end effector. This orientation slip strategy is often present in natural human demonstrations, yet it is generally overlooked in methods to identify constraints from such demonstrations. In this paper, we present a method to model and recognize prehensile orientation slip in human demonstrations of constrained interactions. Using only observations of an end effector, we can detect the type of constraint, parameters of the constraint, and orientation slip properties. Our method uses a novel hierarchical model selection method that is informed by multiple origins of physics-based evidence. A study with eight participants shows that orientation slip occurs in natural demonstrations and confirms that it can be detected by our method.

RO · Aug 8, 2021
Situated Live Programming for Human-Robot Collaboration

Emmanuel Senft, Michael Hagenow, Robert Radwin et al.

We present situated live programming for human-robot collaboration, an approach that enables users with limited programming experience to program collaborative applications for human-robot interaction. Allowing end users, such as shop floor workers, to program collaborative robots themselves would make it easy to "retask" robots from one process to another, facilitating their adoption by small and medium enterprises. Our approach builds on the paradigm of trigger-action programming (TAP) by allowing end users to create rich interactions through simple trigger-action pairings. It enables end users to iteratively create, edit, and refine a reactive robot program while executing partial programs. This live programming approach enables the user to utilize the task space and objects by incrementally specifying situated trigger-action pairs, substantially lowering the barrier to entry for programming or reprogramming robots for collaboration. We instantiate situated live programming in an authoring system where users can create trigger-action programs by annotating an augmented video feed from the robot's perspective and assigning robot actions to trigger conditions. We evaluated this system in a study where participants (n = 10) developed robot programs for solving collaborative light-manufacturing tasks. Results showed that users with little programming experience were able to program HRC tasks in an interactive fashion, and that our situated live programming approach further supported individualized strategies and workflows. We conclude by discussing opportunities and limitations of the proposed approach, our system implementation, and our study, and by outlining a roadmap for expanding this approach to a broader range of tasks and applications.

RO · Jul 10, 2021
Informing Real-time Corrections in Corrective Shared Autonomy Through Expert Demonstrations

Michael Hagenow, Emmanuel Senft, Robert Radwin et al.

Corrective Shared Autonomy is a method where human corrections are layered on top of an otherwise autonomous robot behavior. Specifically, a Corrective Shared Autonomy system leverages an external controller to allow corrections across a range of task variables (e.g., spinning speed of a tool, applied force, path) to address the specific needs of a task. However, this inherent flexibility makes the choice of what corrections to allow at any given instant difficult to determine. This choice of corrections includes determining appropriate robot state variables, scaling for these variables, and a way to allow a user to specify the corrections in an intuitive manner. This paper enables efficient Corrective Shared Autonomy by providing an automated solution based on Learning from Demonstration to both extract the nominal behavior and address these core problems. Our evaluation shows that this solution enables users to successfully complete a surface cleaning task, identifies different strategies users employed in applying corrections, and points to future improvements for our solution.

RO · Feb 14, 2021
Corrective Shared Autonomy for Addressing Task Variability

Michael Hagenow, Emmanuel Senft, Robert Radwin et al.

Many tasks, particularly those involving interaction with the environment, are characterized by high variability, making robotic autonomy difficult. One flexible solution is to introduce the input of a human with superior experience and cognitive abilities as part of a shared autonomy policy. However, current methods for shared autonomy are not designed to address the wide range of necessary corrections (e.g., positions, forces, execution rate) that the user may need to provide to address task variability. In this paper, we present corrective shared autonomy, where users provide corrections to key robot state variables on top of an otherwise autonomous task model. We provide an instantiation of this shared autonomy paradigm and demonstrate its viability and benefits, such as low user effort and physical demand, via a system-level user study on three tasks involving variability situated in aircraft manufacturing.

RO · Oct 29, 2020
A Method for Constraint Inference Using Pose and Wrench Measurements

Guru Subramani, Michael Hagenow, Michael Gleicher et al.

Many physical tasks such as pulling out a drawer or wiping a table can be modeled with geometric constraints. These geometric constraints are characterized by restrictions on kinematic trajectories and reaction wrenches (forces and moments) of objects under the influence of the constraint. This paper presents a method to infer geometric constraints involving unmodeled objects in human demonstrations using both kinematic and wrench measurements. Our approach takes a recording of a human demonstration and determines what constraints are present, when they occur, and their parameters (e.g. positions). By using both kinematic and wrench information, our methods are able to reliably identify a variety of constraint types, even if the constraints only exist for short durations within the demonstration. We present a systematic approach to fitting arbitrary scleronomic constraint models to kinematic and wrench measurements. Reaction forces are estimated from measurements by removing friction. Position, orientation, force, and moment error metrics are developed to provide systematic comparison between constraint models. By conducting a user study, we show that our methods can reliably identify constraints in realistic situations and confirm the value of including forces and moments in the model regression and selection process.