CV | Sep 25, 2023

Dynamic Scene Graph Representation for Surgical Video

Felix Holm, Ghazal Ghazaei, Tobias Czempiel, Ege Özsoy, Stefan Saur, Nassir Navab

arXiv:2309.14538v2 | 15.329 citations | h-index: 94

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for more explainable and robust models in clinical settings by providing a holistic representation of surgical videos, though it is incremental as it builds on existing scene graph and GCN methods.

The authors tackled the problem of automated surgical workflow understanding by proposing dynamic scene graph representations for surgical videos, achieving competitive performance in surgical workflow recognition tasks.

Surgical videos captured from microscopic or endoscopic imaging devices are rich but complex sources of information, depicting different tools and anatomical structures used over an extended period of time. Despite containing crucial workflow information and being commonly recorded in many procedures, the use of surgical videos for automated surgical workflow understanding is still limited. In this work, we exploit scene graphs as a more holistic, semantically meaningful, and human-readable way to represent surgical videos while encoding all anatomical structures, tools, and their interactions. To properly evaluate the impact of our solutions, we create a scene graph dataset from semantic segmentations from the CaDIS and CATARACTS datasets. We demonstrate that graph convolutional networks (GCNs) can leverage scene graphs to tackle surgical downstream tasks such as surgical workflow recognition with competitive performance. Moreover, we demonstrate the benefits of surgical scene graphs regarding the explainability and robustness of model decisions, which are crucial in the clinical setting.
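The idea of turning a semantic segmentation into a scene graph and passing it through a GCN can be illustrated with a minimal sketch. This is not the authors' pipeline: the edge rule (two classes are connected when their pixels touch), the toy 4x4 mask, the class ids, and the function names `build_scene_graph` and `gcn_layer` are all assumptions made for illustration. The propagation step is the commonly used symmetric-normalized GCN update, ReLU(D^-1/2 (A + I) D^-1/2 H W), written in plain Python.

```python
from math import sqrt

def build_scene_graph(mask):
    """Nodes are the class ids present in the mask; an edge links two
    classes whose pixels are 4-adjacent (assumed edge rule)."""
    nodes = sorted({c for row in mask for c in row})
    edges = set()
    h, w = len(mask), len(mask[0])
    for y in range(h):
        for x in range(w):
            a = mask[y][x]
            # Checking the right and bottom neighbors covers every adjacent pair once.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and mask[ny][nx] != a:
                    b = mask[ny][nx]
                    edges.add((min(a, b), max(a, b)))
    return nodes, sorted(edges)

def gcn_layer(nodes, edges, features, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    idx = {n: i for i, n in enumerate(nodes)}
    n = len(nodes)
    # Adjacency matrix with self-loops.
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in edges:
        adj[idx[a]][idx[b]] = adj[idx[b]][idx[a]] = 1.0
    deg = [sum(row) for row in adj]
    norm = [[adj[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    f_in, f_out = len(weight), len(weight[0])
    # Aggregate neighbor features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n)) for f in range(f_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(f_in)))
             for o in range(f_out)] for i in range(n)]

# Toy frame with hypothetical class ids: 0 = background tissue, 1 = pupil, 2 = tool.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 0, 0, 0],
]
nodes, edges = build_scene_graph(mask)

# One-hot node features and an identity weight matrix keep the numbers easy to read.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
embeddings = gcn_layer(nodes, edges, identity, identity)
```

In this toy frame the tool touches only the background class, so the graph has no pupil-tool edge and the pupil node receives no contribution from the tool after one layer. A dynamic representation would repeat this per frame and connect the resulting graphs over time, which the sketch does not attempt.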
