RO · CV · HC · Jul 14, 2025

Probabilistic Human Intent Prediction for Mobile Manipulation: An Evaluation with Human-Inspired Constraints

arXiv:2507.10131v1 · 3 citations · h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses safe and efficient human-robot collaboration in mobile manipulation tasks. It is an incremental advance with measured gains in prediction stability and in how early intent is recognized.

The paper tackles human intent prediction for mobile manipulation robots by proposing GUIDER, a probabilistic framework with dual-phase estimation. Compared with its baselines, GUIDER improved median stability by up to 39.5% in navigation and 31.4% in manipulation, and it recognized object intent three times earlier in geometry-constrained trials.

Accurate inference of human intent enables human-robot collaboration without constraining human control or causing conflicts between humans and robots. We present GUIDER (Global User Intent Dual-phase Estimation for Robots), a probabilistic framework that enables a robot to estimate the intent of human operators. GUIDER maintains two coupled belief layers, one tracking navigation goals and the other manipulation goals. In the Navigation phase, a Synergy Map blends controller velocity with an occupancy grid to rank interaction areas. Upon arrival at a goal, an autonomous multi-view scan builds a local 3D cloud. The Manipulation phase combines U2Net saliency, FastSAM instance saliency, and three geometric grasp-feasibility tests, with an end-effector kinematics-aware update rule that evolves object probabilities in real time. GUIDER can recognize areas and objects of intent without predefined goals. We evaluated GUIDER on 25 trials (five participants × five task variants) in Isaac Sim, and compared it with two baselines, one for navigation and one for manipulation. Across the 25 trials, GUIDER achieved a median stability of 93-100% during navigation, compared with 60-100% for the BOIR baseline, with an improvement of 39.5% in a redirection scenario (T5). During manipulation, stability reached 94-100% (versus 69-100% for Trajectron), with a 31.4% difference in a redirection task (T3). In geometry-constrained manipulation trials, GUIDER recognized the object intent three times earlier than Trajectron (median remaining time to confident prediction 23.6 s vs 7.8 s). These results validate our dual-phase framework and show improvements in intent inference in both phases of mobile manipulation tasks.
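
To make the idea of a belief layer that evolves with operator motion more concrete, the following is a minimal sketch of a recursive Bayesian update over candidate goals. It is not GUIDER's actual update rule, which the abstract does not specify. The function name `update_belief`, the parameters `kappa` and `floor`, and the cosine-based likelihood are all assumptions chosen for illustration.

```python
import numpy as np


def update_belief(belief, position, velocity, goals, kappa=4.0, floor=1e-6):
    """One recursive Bayesian update of a belief over candidate goals.

    Assumed (hypothetical) likelihood: a goal is more likely when the
    commanded velocity points toward it, weighted by exp(kappa * cos(angle)).
    """
    belief = np.asarray(belief, dtype=float)
    goals = np.asarray(goals, dtype=float)
    velocity = np.asarray(velocity, dtype=float)

    speed = np.linalg.norm(velocity)
    if speed < 1e-9:
        return belief  # a stationary command carries no directional evidence

    to_goal = goals - np.asarray(position, dtype=float)
    dist = np.linalg.norm(to_goal, axis=1)
    # Cosine of the angle between the velocity and the direction to each goal.
    cos = (to_goal @ velocity) / (np.maximum(dist, 1e-9) * speed)

    likelihood = np.exp(kappa * cos)
    # The floor keeps every goal recoverable if the operator redirects later.
    posterior = np.maximum(belief * likelihood, floor)
    return posterior / posterior.sum()


# Three candidate goals around the robot; the operator keeps driving along +x.
goals = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
belief = np.full(3, 1.0 / 3.0)
for _ in range(5):
    belief = update_belief(belief, [0.0, 0.0], [0.5, 0.0], goals)
print(belief.round(3))
```

In this sketch the probability floor plays the role of letting the belief recover after a redirection, the kind of scenario in which the abstract reports the largest stability differences. A real system would add further cues to the likelihood, such as the occupancy, saliency, and grasp-feasibility terms the abstract mentions.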

