Kris Hauser

RO
h-index10
24papers
1,043citations
Novelty50%
AI Score48

24 Papers

ROJul 10, 2024
AdaptiGraph: Material-Adaptive Graph-Based Neural Dynamics for Robotic Manipulation

Kaifeng Zhang, Baoyu Li, Kris Hauser et al.

Predictive models are a crucial component of many robotic systems. Yet, constructing accurate predictive models for a variety of deformable objects, especially those with unknown physical properties, remains a significant challenge. This paper introduces AdaptiGraph, a learning-based dynamics modeling approach that enables robots to predict, adapt to, and control a wide array of challenging deformable materials with unknown physical properties. AdaptiGraph leverages the highly flexible graph-based neural dynamics (GBND) framework, which represents material bits as particles and employs a graph neural network (GNN) to predict particle motion. Its key innovation is a unified physical property-conditioned GBND model capable of predicting the motions of diverse materials with varying physical properties without retraining. Upon encountering new materials during online deployment, AdaptiGraph utilizes a physical property optimization process for a few-shot adaptation of the model, enhancing its fit to the observed interaction data. The adapted models can precisely simulate the dynamics and predict the motion of various deformable materials, such as ropes, granular media, rigid boxes, and cloth, while adapting to different physical properties, including stiffness, granular size, and center of pressure. On prediction and manipulation tasks involving a diverse set of real-world deformable objects, our method exhibits superior prediction accuracy and task proficiency over non-material-conditioned and non-adaptive models. The project page is available at https://robopil.github.io/adaptigraph/ .

ROAug 6, 2024
Few-shot Scooping Under Domain Shift via Simulated Maximal Deployment Gaps

Yifan Zhu, Pranay Thangeda, Erica L Tevere et al.

Autonomous lander missions on extraterrestrial bodies need to sample granular materials while coping with domain shifts, even when sampling strategies are extensively tuned on Earth. To tackle this challenge, this paper studies the few-shot scooping problem and proposes a vision-based adaptive scooping strategy that uses the deep kernel Gaussian process method trained with a novel meta-training strategy to learn online from very limited experience on out-of-distribution target terrains. Our Deep Kernel Calibration with Maximal Deployment Gaps (kCMD) strategy explicitly trains a deep kernel model to adapt to large domain shifts by creating simulated maximal deployment gaps from an offline training dataset and training models to overcome these deployment gaps during training. Employed in a Bayesian Optimization sequential decision-making framework, the proposed method allows the robot to perform high-quality scooping actions on out-of-distribution terrains after a few attempts, significantly outperforming non-adaptive methods proposed in the excavation literature as well as other state-of-the-art meta-learning methods. The proposed method also demonstrates zero-shot transfer capability, successfully adapting to the NASA OWLAT platform, which serves as a state-of-the-art simulator for potential future planetary missions. These results demonstrate the potential of training deep models with simulated deployment gaps for more generalizable meta-learning in high-capacity models. Furthermore, they highlight the promise of our method in autonomous lander sampling missions by enabling landers to overcome the deployment gap between Earth and extraterrestrial bodies.

CVNov 16, 2023
On the Overconfidence Problem in Semantic 3D Mapping

Joao Marcos Correia Marques, Albert Zhai, Shenlong Wang et al.

Semantic 3D mapping, the process of fusing depth and image segmentation information between multiple views to build 3D maps annotated with object classes in real-time, is a recent topic of interest. This paper highlights the fusion overconfidence problem, in which conventional mapping methods assign high confidence to the entire map even when they are incorrect, leading to miscalibrated outputs. Several methods to improve uncertainty calibration at different stages in the fusion pipeline are presented and compared on the ScanNet dataset. We show that the most widely used Bayesian fusion strategy is among the worst calibrated, and propose a learned pipeline that combines fusion and calibration, GLFS, which achieves simultaneously higher accuracy and 3D map calibration while retaining real-time capability. We further illustrate the importance of map calibration on a downstream task by showing that incorporating proper semantic fusion on a modular ObjectNav agent improves its success rates. Our code will be provided on Github for reproducibility upon acceptance.

RODec 26, 2025
Flexible Multitask Learning with Factorized Diffusion Policy

Chaoqi Liu, Haonan Chen, Sigmund H. Høeg et al.

Multitask learning poses significant challenges due to the highly multimodal and diverse nature of robot action distributions. However, effectively fitting policies to these complex task distributions is often difficult, and existing monolithic models often underfit the action distribution and lack the flexibility required for efficient adaptation. We introduce a novel modular diffusion policy framework that factorizes complex action distributions into a composition of specialized diffusion models, each capturing a distinct sub-mode of the behavior space for a more effective overall policy. In addition, this modular structure enables flexible policy adaptation to new tasks by adding or fine-tuning components, which inherently mitigates catastrophic forgetting. Empirically, across both simulation and real-world robotic manipulation settings, we illustrate how our method consistently outperforms strong modular and monolithic baselines.

ROFeb 23
Simulation-Ready Cluttered Scene Estimation via Physics-aware Joint Shape and Pose Optimization

Wei-Cheng Huang, Jiaheng Han, Xiaohan Ye et al.

Estimating simulation-ready scenes from real-world observations is crucial for downstream planning and policy learning tasks. Regretfully, existing methods struggle in cluttered environments, often exhibiting prohibitive computational cost, poor robustness, and restricted generality when scaling to multiple interacting objects. We propose a unified optimization-based formulation for real-to-sim scene estimation that jointly recovers the shapes and poses of multiple rigid objects under physical constraints. Our method is built on two key technical innovations. First, we leverage the recently introduced shape-differentiable contact model, whose global differentiability permits joint optimization over object geometry and pose while modeling inter-object contacts. Second, we exploit the structured sparsity of the augmented Lagrangian Hessian to derive an efficient linear system solver whose computational cost scales favorably with scene complexity. Building on this formulation, we develop an end-to-end real-to-sim scene estimation pipeline that integrates learning-based object initialization, physics-constrained joint shape-pose optimization, and differentiable texture refinement. Experiments on cluttered scenes with up to 5 objects and 22 convex hulls demonstrate that our approach robustly reconstructs physically valid, simulation-ready object shapes and poses.

ROJun 18, 2025
Particle-Grid Neural Dynamics for Learning Deformable Object Models from RGB-D Videos

Kaifeng Zhang, Baoyu Li, Kris Hauser et al.

Modeling the dynamics of deformable objects is challenging due to their diverse physical properties and the difficulty of estimating states from limited visual information. We address these challenges with a neural dynamics framework that combines object particles and spatial grids in a hybrid representation. Our particle-grid model captures global shape and motion information while predicting dense particle movements, enabling the modeling of objects with varied shapes and materials. Particles represent object shapes, while the spatial grid discretizes the 3D space to ensure spatial continuity and enhance learning efficiency. Coupled with Gaussian Splattings for visual rendering, our framework achieves a fully learning-based digital twin of deformable objects and generates 3D action-conditioned videos. Through experiments, we demonstrate that our model learns the dynamics of diverse objects -- such as ropes, cloths, stuffed animals, and paper bags -- from sparse-view RGB-D recordings of robot-object interactions, while also generalizing at the category level to unseen instances. Our approach outperforms state-of-the-art learning-based and physics-based simulators, particularly in scenarios with limited camera views. Furthermore, we showcase the utility of our learned models in model-based planning, enabling goal-conditioned object manipulation across a range of tasks. The project page is available at https://kywind.github.io/pgnd .

ROFeb 28, 2025
Map Space Belief Prediction for Manipulation-Enhanced Mapping

Joao Marcos Correia Marques, Nils Dengler, Tobias Zaenker et al.

Searching for objects in cluttered environments requires selecting efficient viewpoints and manipulation actions to remove occlusions and reduce uncertainty in object locations, shapes, and categories. In this work, we address the problem of manipulation-enhanced semantic mapping, where a robot has to efficiently identify all objects in a cluttered shelf. Although Partially Observable Markov Decision Processes~(POMDPs) are standard for decision-making under uncertainty, representing unstructured interactive worlds remains challenging in this formalism. To tackle this, we define a POMDP whose belief is summarized by a metric-semantic grid map and propose a novel framework that uses neural networks to perform map-space belief updates to reason efficiently and simultaneously about object geometries, locations, categories, occlusions, and manipulation physics. Further, to enable accurate information gain analysis, the learned belief updates should maintain calibrated estimates of uncertainty. Therefore, we propose Calibrated Neural-Accelerated Belief Updates (CNABUs) to learn a belief propagation model that generalizes to novel scenarios and provides confidence-calibrated predictions for unknown areas. Our experiments show that our novel POMDP planner improves map completeness and accuracy over existing methods in challenging simulations and successfully transfers to real-world cluttered shelves in zero-shot fashion.

ROMar 30, 2025
Localized Graph-Based Neural Dynamics Models for Terrain Manipulation

Chaoqi Liu, Yunzhu Li, Kris Hauser

Predictive models can be particularly helpful for robots to effectively manipulate terrains in construction sites and extraterrestrial surfaces. However, terrain state representations become extremely high-dimensional especially to capture fine-resolution details and when depth is unknown or unbounded. This paper introduces a learning-based approach for terrain dynamics modeling and manipulation, leveraging the Graph-based Neural Dynamics (GBND) framework to represent terrain deformation as motion of a graph of particles. Based on the principle that the moving portion of a terrain is usually localized, our approach builds a large terrain graph (potentially millions of particles) but only identifies a very small active subgraph (hundreds of particles) for predicting the outcomes of robot-terrain interaction. To minimize the size of the active subgraph we introduce a learning-based approach that identifies a small region of interest (RoI) based on the robot's control inputs and the current scene. We also introduce a novel domain boundary feature encoding that allows GBNDs to perform accurate dynamics prediction in the RoI interior while avoiding particle penetration through RoI boundaries. Our proposed method is both orders of magnitude faster than naive GBND and it achieves better overall prediction accuracy. We further evaluated our framework on excavation and shaping tasks on terrain with different granularity.

ROMar 30, 2024
Efficient Automatic Tuning for Data-driven Model Predictive Control via Meta-Learning

Baoyu Li, William Edwards, Kris Hauser

AutoMPC is a Python package that automates and optimizes data-driven model predictive control. However, it can be computationally expensive and unstable when exploring large search spaces using pure Bayesian Optimization (BO). To address these issues, this paper proposes to employ a meta-learning approach called Portfolio that improves AutoMPC's efficiency and stability by warmstarting BO. Portfolio optimizes initial designs for BO using a diverse set of configurations from previous tasks and stabilizes the tuning process by fixing initial configurations instead of selecting them randomly. Experimental results demonstrate that Portfolio outperforms the pure BO in finding desirable solutions for AutoMPC within limited computational resources on 11 nonlinear control simulation benchmarks and 1 physical underwater soft robot dataset.

ROJan 24, 2022
Automated Heart and Lung Auscultation in Robotic Physical Examinations

Yifan Zhu, Alexander Smith, Kris Hauser

This paper presents the first implementation of autonomous robotic auscultation of heart and lung sounds. To select auscultation locations that generate high-quality sounds, a Bayesian Optimization (BO) formulation leverages visual anatomical cues to predict where high-quality sounds might be located, while using auditory feedback to adapt to patient-specific anatomical qualities. Sound quality is estimated online using machine learning models trained on a database of heart and lung stethoscope recordings. Experiments on 4 human subjects show that our system autonomously captures heart and lung sounds of similar quality compared to tele-operation by a human trained in clinical auscultation. Surprisingly, one of the subjects exhibited a previously unknown cardiac pathology that was first identified using our robot, which demonstrates the potential utility of autonomous robotic auscultation for health screening.

ROMar 25, 2021
Optimized Coverage Planning for UV Surface Disinfection

Joao Marcos Correia Marques, Ramya Ramalingam, Zherong Pan et al.

UV radiation has been used as a disinfection strategy to deactivate a wide range of pathogens, but existing irradiation strategies do not ensure sufficient exposure of all environmental surfaces and/or require long disinfection times. We present a near-optimal coverage planner for mobile UV disinfection robots. The formulation optimizes the irradiation time efficiency, while ensuring that a sufficient dosage of radiation is received by each surface. The trajectory and dosage plan are optimized taking collision and light occlusion constraints into account. We propose a two-stage scheme to approximate the solution of the induced NP-hard optimization, and, for efficiency, perform key irradiance and occlusion calculations on a GPU. Empirical results show that our technique achieves more coverage for the same exposure time as strategies for existing UV robots, can be used to compare UV robot designs, and produces near-optimal plans. This is an extended version of the paper originally contributed to ICRA2021.

ROOct 28, 2020
Implicit Integration for Articulated Bodies with Contact via the Nonconvex Maximal Dissipation Principle

Zherong Pan, Kris Hauser

We present non-convex maximal dissipation principle (NMDP), a time integration scheme for articulated bodies with simultaneous contacts. Our scheme resolves contact forces via the maximal dissipation principle (MDP). Prior MDP solvers compute contact forces via convex programming by assuming linearized dynamics integrated using the forward multistep scheme. Instead, we consider the coupled system of nonlinear Newton-Euler dynamics and MDP, which is time-integrated using the backward integration scheme. We show that the coupled system of equations can be solved efficiently using the projected gradient method with guaranteed convergence. We evaluate our method by predicting several locomotion trajectories for a quadruped robot. The results show that our NMDP scheme has several desirable properties including: (1) generalization to novel contact models; (2) superior stability under large timestep sizes; (3) consistent trajectory generation under varying timestep sizes.

ROOct 20, 2020
Decision Making in Joint Push-Grasp Action Space for Large-Scale Object Sorting

Zherong Pan, Kris Hauser

We present a planner for large-scale (un)labeled object sorting tasks, which uses two types of manipulation actions: overhead grasping and planar pushing. The grasping action offers completeness guarantee under mild assumptions, and planar pushing is an acceleration strategy that moves multiple objects at once. Our main contribution is twofold: (1) We propose a bilevel planning algorithm. Our high-level planner makes efficient, near-optimal choices between pushing and grasping actions based on a cost model. Our low-level planner computes one-step greedy pushing or grasping actions. (2) We propose a novel low-level push planner that can find one-step greedy pushing actions in a semi-discrete search space. The structure of the search space allows us to efficient We show that, for sorting up to $200$ objects, our planner can find near-optimal actions with $10$ seconds of computation on a desktop machine.

ROFeb 3, 2020
Toward Autonomous Robotic Micro-Suturing using Optical Coherence Tomography Calibration and Path Planning

Yuan Tian, Mark Draelos, Gao Tang et al.

Robotic automation has the potential to assist human surgeons in performing suturing tasks in microsurgery, and in order to do so a robot must be able to guide a needle with sub-millimeter precision through soft tissue. This paper presents a robotic suturing system that uses 3D optical coherence tomography (OCT) system for imaging feedback. Calibration of the robot-OCT and robot-needle transforms, wound detection, keypoint identification, and path planning are all performed automatically. The calibration method handles pose uncertainty when the needle is grasped using a variant of iterative closest points. The path planner uses the identified wound shape to calculate needle entry and exit points to yield an evenly-matched wound shape after closure. Experiments on tissue phantoms and animal tissue demonstrate that the system can pass a suture needle through wounds with 0.27 mm overall accuracy in achieving the planned entry and exit points.

RODec 10, 2018
Stable bin packing of non-convex 3D objects with a robot manipulator

Fan Wang, Kris Hauser

Recent progress in the field of robotic manipulation has generated interest in fully automatic object packing in warehouses. This paper proposes a formulation of the packing problem that is tailored to the automated warehousing domain. Besides minimizing waste space inside a container, the problem requires stability of the object pile during packing and the feasibility of the robot motion executing the placement plans. To address this problem, a set of constraints are formulated, and a constructive packing pipeline is proposed to solve for these constraints. The pipeline is able to pack geometrically complex, non-convex objects with stability while satisfying robot constraints. In particular, a new 3D positioning heuristic called Heightmap-Minimization heuristic is proposed, and heightmaps are used to speed up the search. Experimental evaluation of the method is conducted with a realistic physical simulator on a dataset of scanned real-world items, demonstrating stable and high-quality packing plans compared with other 3D packing methods.

RONov 27, 2018
Fast UAV Trajectory Optimization using Bilevel Optimization with Analytical Gradients

Weidong Sun, Gao Tang, Kris Hauser

We present an efficient optimization framework that solves trajectory optimization problems by decoupling state variables from timing variables, thereby decomposing a challenging nonlinear programming (NLP) problem into two easier subproblems. With timing fixed, the state variables can be optimized efficiently using convex optimization, and the timing variables can be optimized in a separate NLP, which forms a bilevel optimization problem. The challenge of obtaining the gradient of the timing variables is solved by sensitivity analysis of parametric NLPs. The exact analytic gradient is computed from the dual solution as a by-product, whereas existing finite-difference techniques require additional optimization. The bilevel optimization framework efficiently optimizes both timing and state variables which is demonstrated on generating trajectories for an unmanned aerial vehicle. Numerical experiments demonstrate that bilevel optimization converges significantly more reliably than a standard NLP solver, and analytical gradients outperform finite differences in terms of computation speed and accuracy. Physical experiments demonstrate its real-time applicability for reactive target tracking tasks.

ROJul 23, 2018
Unified Multi-Contact Fall Mitigation Planning for Humanoids via Contact Transition Tree Optimization

Shihao Wang, Kris Hauser

This paper presents a multi-contact approach to generalized humanoid fall mitigation planning that unifies inertial shaping, protective stepping, and hand contact strategies. The planner optimizes both the contact sequence and the robot state trajectories. A high-level tree search is conducted to iteratively grow a contact transition tree. At each edge of the tree, trajectory optimization is used to calculate robot stabilization trajectories that produce the desired contact transition while minimizing kinetic energy. Also, at each node of the tree, the optimizer attempts to find a self-motion (inertial shaping movement) to eliminate kinetic energy. This paper also presents an efficient and effective method to generate initial seeds to facilitate trajectory optimization. Experiments demonstrate show that our proposed algorithm can generate complex stabilization strategies for a simulated robot under varying initial pushes and environment shapes.

ROMar 7, 2018
Discontinuity-Sensitive Optimal Control Learning by Mixture of Experts

Gao Tang, Kris Hauser

This paper proposes a discontinuity-sensitive approach to learn the solutions of parametric optimal control problems with high accuracy. Many tasks, ranging from model predictive control to reinforcement learning, may be solved by learning optimal solutions as a function of problem parameters. However, nonconvexity, discrete homotopy classes, and control switching cause discontinuity in the parameter-solution mapping, thus making learning difficult for traditional continuous function approximators. A mixture of experts (MoE) model composed of a classifier and several regressors is proposed to address such an issue. The optimal trajectories of different parameters are clustered such that in each cluster the trajectories are continuous function of problem parameters. Numerical examples on benchmark problems show that training the classifier and regressors individually outperforms joint training of MoE. With suitably chosen clusters, this approach not only achieves lower prediction error with less training data and fewer model parameters, but also leads to dramatic improvements in the reliability of trajectory tracking compared to traditional universal function approximation models (e.g., neural networks).

ROMay 16, 2016
Learning the Problem-Optimum Map: Analysis and Application to Global Optimization in Robotics

Kris Hauser

This paper describes a data-driven framework for approximate global optimization in which precomputed solutions to a sample of problems are retrieved and adapted during online use to solve novel problems. This approach has promise for real-time applications in robotics, since it can produce near-globally optimal solutions orders of magnitude faster than standard methods. This paper establishes theoretical conditions on how many and where samples are needed over the space of problems to achieve a given approximation quality. The framework is applied to solve globally optimal collision-free inverse kinematics (IK) problems, wherein large solution databases are used to produce near-optimal solutions in sub-millisecond time on a standard PC.

ROJan 21, 2016
Analysis and Observations from the First Amazon Picking Challenge

Nikolaus Correll, Kostas E. Bekris, Dmitry Berenson et al.

This paper presents a overview of the inaugural Amazon Picking Challenge along with a summary of a survey conducted among the 26 participating teams. The challenge goal was to design an autonomous robot to pick items from a warehouse shelf. This task is currently performed by human workers, and there is hope that robots can someday help increase efficiency and throughput while lowering cost. We report on a 28-question survey posed to the teams to learn about each team's background, mechanism design, perception apparatus, planning and control approach. We identify trends in this data, correlate it with each team's success in the competition, and discuss observations and lessons learned based on survey results and the authors' personal experiences during the challenge.

ROMay 15, 2015
Asymptotically Optimal Planning by Feasible Kinodynamic Planning in State-Cost Space

Kris Hauser, Yilun Zhou

This paper presents an equivalence between feasible kinodynamic planning and optimal kinodynamic planning, in that any optimal planning problem can be transformed into a series of feasible planning problems in a state-cost space whose solutions approach the optimum. This transformation gives rise to a meta-algorithm that produces an asymptotically optimal planner, given any feasible kinodynamic planner as a subroutine. The meta-algorithm is proven to be asymptotically optimal, and a formula is derived relating expected running time and solution suboptimality. It is directly applicable to a wide range of optimal planning problems because it does not resort to the use of steering functions or numerical boundary-value problem solvers. On a set of benchmark problems, it is demonstrated to perform, using the EST and RRT algorithms as subroutines, at a superior or comparable level to related planners.

AIJan 10, 2013
Artificial Intelligence Framework for Simulating Clinical Decision-Making: A Markov Decision Process Approach

Casey C. Bennett, Kris Hauser

In the modern healthcare system, rapidly expanding costs/complexity, the growing myriad of treatment options, and exploding information streams that often do not effectively reach the front lines hinder the ability to choose optimal treatment decisions over time. The goal in this paper is to develop a general purpose (non-disease-specific) computational/artificial intelligence (AI) framework to address these challenges. This serves two potential functions: 1) a simulation environment for exploring various healthcare policies, payment methodologies, etc., and 2) the basis for clinical artificial intelligence - an AI that can think like a doctor. This approach combines Markov decision processes and dynamic decision networks to learn from clinical data and develop complex plans via simulation of alternative sequential decision paths while capturing the sometimes conflicting, sometimes synergistic interactions of various components in the healthcare system. It can operate in partially observable environments (in the case of missing observations or data) by maintaining belief states about patient health status and functions as an online agent that plans and re-plans. This framework was evaluated using real patient data from an electronic health record. Such an AI framework easily outperforms the current treatment-as-usual (TAU) case-rate/fee-for-service models of healthcare (Cost per Unit Change: $189 vs. $497) while obtaining a 30-35% increase in patient outcomes. Tweaking certain model parameters further enhances this advantage, obtaining roughly 50% more improvement for roughly half the costs. Given careful design and problem formulation, an AI simulation framework can approximate optimal decisions even in complex and uncertain environments. Future work is described that outlines potential lines of research and integration of machine learning algorithms for personalized medicine.