CV · Aug 24, 2023

Robotic Scene Segmentation with Memory Network for Runtime Surgical Context Inference

arXiv:2308.12789v1 · 3 citations · h-index: 18 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for robot-assisted surgery, enabling better workflow analysis and error detection, but it is incremental as it builds on existing memory network approaches.

The paper tackles the problem of runtime surgical context inference in robot-assisted surgery by improving video segmentation for infrequent classes and temporal consistency, resulting in superior segmentation performance for difficult objects like needle and thread and improved context inference on the JIGSAWS dataset.

Surgical context inference has recently garnered significant attention in robot-assisted surgery as it can facilitate workflow analysis, skill assessment, and error detection. However, runtime context inference is challenging since it requires timely and accurate detection of the interactions among the tools and objects in the surgical scene based on the segmentation of video data. On the other hand, existing state-of-the-art video segmentation methods are often biased against infrequent classes and fail to provide temporal consistency for segmented masks. This can negatively impact the context inference and accurate detection of critical states. In this study, we propose a solution to these challenges using a Space Time Correspondence Network (STCN). STCN is a memory network that performs binary segmentation and minimizes the effects of class imbalance. The use of a memory bank in STCN allows for the utilization of past image and segmentation information, thereby ensuring consistency of the masks. Our experiments using the publicly available JIGSAWS dataset demonstrate that STCN achieves superior segmentation performance for objects that are difficult to segment, such as needle and thread, and improves context inference compared to the state-of-the-art. We also demonstrate that segmentation and context inference can be performed at runtime without compromising performance.
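The abstract says the memory bank lets STCN reuse past images and segmentations to keep masks consistent. A minimal sketch of that kind of memory readout is shown below. It assumes the affinity is a negative squared L2 distance between key features followed by a softmax over memory locations, which is how STCN is usually described but is not stated in the text here. The function name, the array shapes, and the use of ready-made feature arrays in place of a learned encoder are all hypothetical and for illustration only.

```python
import numpy as np

def memory_readout(query_key, memory_keys, memory_values):
    """Read values from a memory bank for every query location.

    query_key:     (C, N) key features of the current frame (N pixels)
    memory_keys:   (C, M) key features stored from past frames (M pixels)
    memory_values: (D, M) value features encoding past frames and their masks
    returns:       (D, N) values aggregated for each query pixel
    """
    # Negative squared L2 distance between each memory and query location (assumed similarity).
    q_sq = (query_key ** 2).sum(axis=0)[None, :]     # (1, N)
    m_sq = (memory_keys ** 2).sum(axis=0)[:, None]   # (M, 1)
    cross = memory_keys.T @ query_key                # (M, N)
    affinity = -(m_sq - 2.0 * cross + q_sq)          # (M, N)

    # Softmax over the memory axis, shifted for numerical stability.
    affinity = affinity - affinity.max(axis=0, keepdims=True)
    weights = np.exp(affinity)
    weights = weights / weights.sum(axis=0, keepdims=True)

    # Each query pixel receives a weighted sum of the stored values.
    return memory_values @ weights                   # (D, N)

# Toy usage: two memory locations, one query location that matches the first.
memory_keys = np.array([[0.0, 10.0], [0.0, 10.0]])   # columns are (0, 0) and (10, 10)
memory_values = np.array([[1.0, 5.0]])               # one value channel per memory location
query_key = np.array([[0.0], [0.0]])
readout = memory_readout(query_key, memory_keys, memory_values)
```

In this toy case the query key coincides with the first memory key, so the readout is dominated by the first stored value. In a per-object binary segmentation setup, a decoder would then turn such a readout into one object's mask, and past predictions added to the memory would carry that mask forward in time.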

Code Implementations: 1 repo