Why Deep Surgical Models Fail?: Revisiting Surgical Action Triplet Recognition through the Lens of Robustness
This work addresses the low performance of surgical action triplet recognition, a task intended to support surgeons, but it is incremental: it analyzes existing models rather than proposing a new solution.
The authors investigate why deep learning models underperform in surgical action triplet recognition. Using robustness and explainability analyses, they find that the key to better performance and reliability lies in the core and spurious attributes.
Surgical action triplet recognition provides a better understanding of the surgical scene. This task is of high relevance as it provides the surgeon with context-aware support and safety. The current go-to strategy for improving performance is the development of new network mechanisms. However, the performance of current state-of-the-art techniques is substantially lower than on other surgical tasks. Why is this happening? This is the question that we address in this work. We present the first study to understand the failure of existing deep learning models through the lens of robustness and explainability. First, we study existing models under weak and strong $\delta$-perturbations via an adversarial optimisation scheme. We then analyse the failure modes via feature-based explanations. Our study reveals that the key to improving performance and increasing reliability lies in the core and spurious attributes. Our work opens the door to more trustworthy and reliable deep learning models in surgical data science.
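To make the idea of weak and strong $\delta$-perturbations concrete, the following is a minimal sketch of one common adversarial optimisation scheme: projected gradient ascent on the loss inside an $\ell_\infty$ ball of radius $\delta$. The abstract does not specify the norm, the optimiser, or the model, so all of these are assumptions. The toy linear classifier stands in for a surgical triplet network, and the names `bce_loss` and `pgd_perturb` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, b, x, y):
    """Binary cross-entropy of a toy linear classifier on a single input."""
    p = sigmoid(x @ w + b)
    eps = 1e-12  # guards against log(0)
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def pgd_perturb(w, b, x, y, delta, steps=10):
    """Projected gradient ascent on the loss within an L-inf ball of radius delta.

    Assumption: the perturbation budget is measured in the L-inf norm.
    """
    step_size = 2.5 * delta / steps
    eta = np.zeros_like(x)
    for _ in range(steps):
        p = sigmoid((x + eta) @ w + b)
        grad = (p - y) * w                      # gradient of the loss w.r.t. the input
        eta = eta + step_size * np.sign(grad)   # ascent step
        eta = np.clip(eta, -delta, delta)       # project back onto the delta-ball
    return x + eta

# Hypothetical toy data: random weights and one input with label 1.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.1
x = rng.normal(size=8)
y = 1.0

x_weak = pgd_perturb(w, b, x, y, delta=0.01)    # "weak" budget (illustrative value)
x_strong = pgd_perturb(w, b, x, y, delta=0.3)   # "strong" budget (illustrative value)

loss_clean = bce_loss(w, b, x, y)
loss_weak = bce_loss(w, b, x_weak, y)
loss_strong = bce_loss(w, b, x_strong, y)
print(loss_clean, loss_weak, loss_strong)
```

In this toy setting the loss grows with the budget $\delta$. Comparing how quickly a model degrades as $\delta$ increases is one way to quantify robustness under these assumptions.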