CV | Jul 7, 2024

Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness

arXiv:2407.05448v1 | 1 citation | h-index: 33
Originality: Incremental advance
AI Analysis

This work addresses the scalability issue of supervised methods for surgical systems by enabling more efficient scene understanding in operating rooms, though it is incremental as it builds on existing self-supervised techniques.

The authors aim to reduce annotation needs for semantic segmentation and activity classification in operating rooms. They propose a 3D self-supervised pretext task that predicts the relative distance between image patches using depth maps, and they report noteworthy performance improvements on both tasks, especially when labeled data is limited.

Semantic segmentation and activity classification are key components of intelligent surgical systems that can understand and assist clinical workflow. In the operating room (OR), semantic segmentation is at the core of building robots that are aware of their clinical surroundings, whereas activity classification aims to understand OR workflow at a higher level. State-of-the-art semantic segmentation and activity recognition approaches are fully supervised, which is not scalable. Self-supervision can decrease the amount of annotated data needed. We propose a new 3D self-supervised task for OR scene understanding that uses OR scene images captured with time-of-flight (ToF) cameras. In contrast to other self-supervised approaches, whose handcrafted pretext tasks focus on 2D image features, our proposed task consists of predicting the relative 3D distance of image patches by exploiting the depth maps. Learning 3D spatial context generates discriminative features for our downstream tasks. Our approach is evaluated on two tasks and datasets containing multi-view data captured from clinical scenarios. We demonstrate a noteworthy improvement in performance on both tasks, particularly in the low-data regime, where the utility of self-supervised learning is highest.
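
To make the pretext task in the abstract more concrete, here is a minimal sketch of how a distance label for two image patches could be derived from a depth map. It is an illustration under stated assumptions, not the authors' implementation: the pinhole camera model, the intrinsics (`fx`, `fy`, `cx`, `cy`), the use of patch centres, and all function names are hypothetical. The abstract does not specify how patches are sampled or how the clustering named in the title enters the task, so neither is modelled here.

```python
import math


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with depth in metres to a 3D camera-frame point.

    Assumes an ideal pinhole camera; the intrinsics are hypothetical inputs.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def patch_mean_depth(depth_map, top, left, size):
    """Mean of the valid (non-zero) depth values in a square patch.

    Returns None when the patch has no valid depth reading.
    """
    vals = [
        depth_map[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
        if depth_map[r][c] > 0
    ]
    return sum(vals) / len(vals) if vals else None


def patch_distance_label(depth_map, patch_a, patch_b, size, intrinsics):
    """Euclidean 3D distance between the centres of two patches.

    Each patch is given as (top, left). The result could serve as a
    self-supervised regression target; returns None if either patch
    lacks valid depth.
    """
    fx, fy, cx, cy = intrinsics
    points = []
    for top, left in (patch_a, patch_b):
        depth = patch_mean_depth(depth_map, top, left, size)
        if depth is None:
            return None
        u = left + (size - 1) / 2.0  # patch centre, horizontal pixel coordinate
        v = top + (size - 1) / 2.0   # patch centre, vertical pixel coordinate
        points.append(backproject(u, v, depth, fx, fy, cx, cy))
    return math.dist(points[0], points[1])


# Toy example: a flat wall 2 m from the camera, seen in an 8x8 depth map.
flat_wall = [[2.0] * 8 for _ in range(8)]
toy_intrinsics = (4.0, 4.0, 3.5, 3.5)  # hypothetical fx, fy, cx, cy
label = patch_distance_label(flat_wall, (0, 0), (0, 4), 4, toy_intrinsics)
```

In a training loop, a network would see only the two RGB patches and be asked to predict `label`, so no manual annotation is involved. Whether the target is a raw distance, a normalised value, or a discretised class is a design choice this sketch leaves open.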

