CV · LG · RO · Feb 28, 2023

Towards Surgical Context Inference and Translation to Gestures

arXiv:2302.14237v2 · 5 citations · h-index: 18
AI Analysis

This work addresses the problem of reducing manual effort and errors in gesture labeling for robot-assisted surgery, representing an incremental improvement by combining existing techniques in a novel way for this domain.

The paper tackles the labor-intensive and error-prone manual labeling of gestures in robot-assisted surgery by proposing an automated method that uses segmentation masks to infer surgical context and translates that context into gesture transcripts. The authors report state-of-the-art performance in recognizing needle and thread, and a labeling process that is approximately 2.8 times shorter.

Manual labeling of gestures in robot-assisted surgery is labor-intensive, prone to errors, and requires expertise or training. We propose a method for automated and explainable generation of gesture transcripts that leverages the abundance of data for image segmentation. Surgical context is detected using segmentation masks by examining the distances and intersections between the tools and objects. Next, context labels are translated into gesture transcripts using knowledge-based Finite State Machine (FSM) and data-driven Long Short-Term Memory (LSTM) models. We evaluate the performance of each stage of our method by comparing the results with the ground truth segmentation masks, the consensus context labels, and the gesture labels in the JIGSAWS dataset. Our results show that our segmentation models achieve state-of-the-art performance in recognizing needle and thread in Suturing, and that we can automatically detect important surgical states with high agreement with crowd-sourced labels (e.g., contact between graspers and objects in Suturing). We also find that the FSM models are more robust to poor segmentation and labeling performance than LSTMs. Our proposed method can significantly shorten the gesture labeling process (~2.8 times).
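The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_gap` threshold, the context labels, and the gesture names in `TRANSITIONS` are all hypothetical and chosen only for illustration. The first part decides whether two objects are in contact from the intersection of, and distance between, their binary segmentation masks; the second part translates a sequence of context labels into a gesture transcript with a small hand-written finite state machine.

```python
import numpy as np


def min_distance(mask_a, mask_b):
    """Smallest Euclidean pixel distance between two boolean masks.

    Returns infinity if either mask is empty. Uses a brute-force pairwise
    comparison, which is only suitable for small illustrative masks.
    """
    pts_a = np.argwhere(mask_a)
    pts_b = np.argwhere(mask_b)
    if len(pts_a) == 0 or len(pts_b) == 0:
        return float("inf")
    diffs = pts_a[:, None, :] - pts_b[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).min())


def in_contact(mask_a, mask_b, max_gap=1.5):
    """Treat two objects as touching if their masks intersect or lie
    within max_gap pixels of each other (max_gap is an assumed threshold)."""
    if np.logical_and(mask_a, mask_b).any():
        return True
    return min_distance(mask_a, mask_b) <= max_gap


# Hypothetical transition table: (current gesture, context label) -> next gesture.
# These labels are placeholders, not the JIGSAWS gesture vocabulary.
TRANSITIONS = {
    ("idle", "grasper_touches_needle"): "reach_for_needle",
    ("reach_for_needle", "grasper_holds_needle"): "position_needle",
    ("position_needle", "needle_in_tissue"): "push_needle",
}


def translate(contexts, start="idle"):
    """Map a per-frame sequence of context labels to a gesture transcript.

    Context labels with no matching transition leave the gesture unchanged.
    """
    state = start
    transcript = []
    for context in contexts:
        state = TRANSITIONS.get((state, context), state)
        transcript.append(state)
    return transcript


if __name__ == "__main__":
    grasper = np.zeros((5, 5), dtype=bool)
    needle = np.zeros((5, 5), dtype=bool)
    grasper[1, 1] = True
    needle[1, 2] = True
    print(in_contact(grasper, needle))
    print(translate(["grasper_touches_needle", "grasper_holds_needle"]))
```

Because the default transition keeps the current gesture, a single misdetected context label does not derail the transcript in this sketch.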

Code Implementations: 1 repo