ROAICVHCSPApr 8, 2021

Multimodal Fusion of EMG and Vision for Human Grasp Intent Inference in Prosthetic Hand Control

arXiv:2104.03893v552 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of reliable prosthetic hand control for transradial amputees, representing an incremental advance through multimodal fusion.

The paper tackled the problem of poor grasp intent inference in prosthetic hand control by fusing EMG and vision sensor data, achieving a 13.66% and 14.8% improvement over individual modalities and an overall accuracy of 95.3%.

Objective: For transradial amputees, robotic prosthetic hands promise to regain the capability to perform daily living activities. Current control methods based on physiological signals such as electromyography (EMG) are prone to yielding poor inference outcomes due to motion artifacts, muscle fatigue, and many more. Vision sensors are a major source of information about the environment state and can play a vital role in inferring feasible and intended gestures. However, visual evidence is also susceptible to its own artifacts, most often due to object occlusion, lighting changes, etc. Multimodal evidence fusion using physiological and vision sensor measurements is a natural approach due to the complementary strengths of these modalities. Methods: In this paper, we present a Bayesian evidence fusion framework for grasp intent inference using eye-view video, eye-gaze, and EMG from the forearm processed by neural network models. We analyze individual and fused performance as a function of time as the hand approaches the object to grasp it. For this purpose, we have also developed novel data processing and augmentation techniques to train neural network components. Results: Our results indicate that, on average, fusion improves the instantaneous upcoming grasp type classification accuracy while in the reaching phase by 13.66% and 14.8%, relative to EMG (81.64% non-fused) and visual evidence (80.5% non-fused) individually, resulting in an overall fusion accuracy of 95.3%. Conclusion: Our experimental data analyses demonstrate that EMG and visual evidence show complementary strengths, and as a consequence, fusion of multimodal evidence can outperform each individual evidence modality at any given time.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes