Cross-Dataset Adaptation for Instrument Classification in Cataract Surgery Videos
This work addresses poor cross-dataset generalization in surgical tool classification for cataract surgery, a task that matters for intra-operative and post-operative analysis. The contribution is incremental in that it builds on existing unsupervised domain adaptation (UDA) techniques.
The paper tackles domain shift in surgical tool presence detection across cataract surgery video datasets. It proposes an unsupervised domain adaptation method, the Barlow Adaptor, which is reported to improve cross-dataset performance by 6% over state-of-the-art methods.
Surgical tool presence detection is an important part of the intra-operative and post-operative analysis of a surgery. State-of-the-art models that perform this task well on one dataset, however, perform poorly when tested on another. This occurs because of a significant domain shift between datasets, which results from differences in tools, sensors, data resolution, and other acquisition factors. In this paper, we highlight this domain shift in the commonly performed cataract surgery and propose a novel end-to-end Unsupervised Domain Adaptation (UDA) method, the Barlow Adaptor, that addresses the distribution shift without requiring any labels from the other domain. In addition, we introduce a novel loss, the Barlow Feature Alignment Loss (BFAL), which aligns features across domains while reducing redundancy and the need for large batch sizes, thus improving cross-dataset performance. Extensive experiments on two cataract surgery datasets show that the proposed method outperforms state-of-the-art UDA methods by 6%. The code can be found at https://github.com/JayParanjape/Barlow-Adaptor
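The abstract does not give the exact form of BFAL, so the following is only a sketch of the general idea its name suggests: a Barlow Twins-style redundancy-reduction objective applied to a batch of source-domain features and a batch of target-domain features. The function name, the pairing of source and target batches, the off-diagonal weight, and the use of NumPy are all assumptions made for illustration; the authors' actual implementation is in the repository linked above and may differ.

```python
import numpy as np


def barlow_style_alignment_loss(source_feats, target_feats,
                                off_diag_weight=5e-3, eps=1e-6):
    """Hypothetical Barlow Twins-style loss between two feature batches.

    source_feats, target_feats: arrays of shape (batch, dim).
    This is an illustrative sketch, not the paper's BFAL definition.
    """
    n = source_feats.shape[0]

    # Standardize each feature dimension over the batch.
    zs = (source_feats - source_feats.mean(axis=0)) / (source_feats.std(axis=0) + eps)
    zt = (target_feats - target_feats.mean(axis=0)) / (target_feats.std(axis=0) + eps)

    # Cross-correlation matrix between source and target dimensions, (dim, dim).
    c = zs.T @ zt / n

    diag = np.diag(c)
    # Alignment term: matching dimensions should be perfectly correlated.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy term: distinct dimensions should be decorrelated.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)

    return float(on_diag + off_diag_weight * off_diag)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(256, 8))   # stand-in for source-domain features
    tgt = rng.normal(size=(256, 8))   # stand-in for unrelated target features
    print("aligned:  ", barlow_style_alignment_loss(src, src))
    print("unaligned:", barlow_style_alignment_loss(src, tgt))
```

In this sketch the loss is small when the two batches carry the same, mutually decorrelated features and large when they are unrelated. Because the objective is built from a correlation matrix over feature dimensions rather than from pairwise sample comparisons, it plausibly depends less on batch size than contrastive alternatives, which is consistent with the abstract's claim about BFAL.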