Intention estimation from gaze and motion features for human-robot shared-control object manipulation
This work addresses the need for better teleoperation assistance in robotics. Its contribution is incremental: it builds on existing shared-control frameworks and extends them with new feature integration.
The paper tackles robust and prompt intention estimation for human-robot shared-control object manipulation. It uses gaze and motion features to predict the current action and the target object in simulated pick-and-place sequences, and reports good accuracy and earliness of prediction across different users and hands.
Shared control can help in teleoperated object manipulation by assisting with the execution of the user's intention. To this end, robust and prompt intention estimation is needed, which relies on behavioral observations. Here, an intention estimation framework is presented that uses natural gaze and motion features to predict the current action and the target object. The system is trained and tested in a simulated environment on pick-and-place sequences performed in a relatively cluttered scene with either hand, including possible hand-over from one hand to the other. Validation is conducted across different users and hands, achieving good accuracy and earliness of prediction. An analysis of the predictive power of single features shows that the grasping trigger and the gaze features dominate in the early identification of the current action. In the current framework, the same probabilistic model can be used for the two hands working in parallel and independently, while a rule-based model is proposed to identify the resulting bimanual action. Finally, the limitations of this approach and perspectives for extending it to more complex, fully bimanual manipulations are discussed.
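The two-level structure described above (one probabilistic model per hand, plus rules that combine the two per-hand estimates) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the model, so the example assumes a simple recursive Bayesian update over binary features. The action labels, the likelihood values, the `HandEstimator` class, and the rules in `bimanual_action` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical action labels; the paper's actual action set is not given here.
ACTIONS = ("idle", "pick", "place")

# Hypothetical likelihoods P(feature is True | action) for two binary features:
# the grasping trigger being pressed and the gaze resting on an object.
LIKELIHOOD = {
    "idle": {"trigger": 0.05, "gaze_on_object": 0.2},
    "pick": {"trigger": 0.6, "gaze_on_object": 0.9},
    "place": {"trigger": 0.9, "gaze_on_object": 0.3},
}


def _uniform_belief() -> dict:
    """Start with no preference among the actions."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}


@dataclass
class HandEstimator:
    """Per-hand belief over actions; the same model is reused for each hand."""

    belief: dict = field(default_factory=_uniform_belief)

    def update(self, trigger: bool, gaze_on_object: bool) -> dict:
        """One Bayesian update, assuming the features are conditionally independent."""
        posterior = {}
        for action in ACTIONS:
            p_trig = LIKELIHOOD[action]["trigger"]
            p_gaze = LIKELIHOOD[action]["gaze_on_object"]
            p_obs = (p_trig if trigger else 1.0 - p_trig) * (
                p_gaze if gaze_on_object else 1.0 - p_gaze
            )
            posterior[action] = self.belief[action] * p_obs
        total = sum(posterior.values())
        self.belief = {a: p / total for a, p in posterior.items()}
        return self.belief

    def most_likely(self) -> str:
        return max(self.belief, key=self.belief.get)


def bimanual_action(left: str, right: str) -> str:
    """Hypothetical rules mapping two per-hand actions to one bimanual label."""
    if left == "place" and right == "pick":
        return "handover_left_to_right"
    if left == "pick" and right == "place":
        return "handover_right_to_left"
    if left == "idle" and right == "idle":
        return "idle"
    if left == "idle":
        return f"right_{right}"
    if right == "idle":
        return f"left_{left}"
    return "independent_bimanual"


# The two hands are tracked in parallel and independently.
left_hand, right_hand = HandEstimator(), HandEstimator()
left_hand.update(trigger=False, gaze_on_object=True)
right_hand.update(trigger=False, gaze_on_object=False)
print(bimanual_action(left_hand.most_likely(), right_hand.most_likely()))
```

Keeping the per-hand estimators independent means a single trained model serves both hands, and only the small rule table needs to change when new bimanual combinations are considered.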