Dual Invariance Self-training for Reliable Semi-supervised Surgical Phase Recognition
This work addresses the scarcity of labeled data for surgical phase recognition and offers an incremental improvement in semi-supervised learning for computer-assisted interventions.
The paper tackles unreliable pseudo-label assessment in semi-supervised surgical phase recognition by proposing the Dual Invariance Self-Training (DIST) framework, which improves accuracy on the Cataract and Cholec80 datasets and outperforms state-of-the-art SSL and supervised baselines.
Accurate surgical phase recognition is crucial for advancing computer-assisted interventions, yet the scarcity of labeled data hinders the training of reliable deep learning models. Semi-supervised learning (SSL), particularly with pseudo-labeling, shows promise over fully supervised methods but often lacks a reliable mechanism for assessing pseudo-labels. To address this gap, we propose a novel SSL framework, Dual Invariance Self-Training (DIST), that incorporates both Temporal and Transformation Invariance to enhance surgical phase recognition. Our two-step self-training process dynamically selects reliable pseudo-labels, ensuring robust pseudo-supervision. This mitigates the risk of noisy pseudo-labels, steering decision boundaries toward the true data distribution and improving generalization to unseen data. Evaluations on the Cataract and Cholec80 datasets show that our method outperforms state-of-the-art SSL approaches and consistently surpasses both supervised and SSL baselines across various network architectures.
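
To make the idea of selecting pseudo-labels by two invariance checks concrete, the following is a minimal sketch and not the DIST implementation. The abstract does not specify the selection criteria, so everything here is an assumption made for illustration: the function name `select_pseudo_labels`, the use of label agreement between an original and a transformed view as the transformation-invariance check, the use of label agreement within a small temporal `window` as the temporal-invariance check, and the choice to reject frames that fail either check.

```python
from typing import List, Optional, Sequence


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability (first one wins on ties)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def select_pseudo_labels(
    probs_orig: Sequence[Sequence[float]],
    probs_aug: Sequence[Sequence[float]],
    window: int = 1,
) -> List[Optional[int]]:
    """Return one pseudo-label per frame, or None where the frame is rejected.

    probs_orig[t] and probs_aug[t] are class probabilities for frame t,
    predicted from the original and from a transformed input. Both are
    hypothetical model outputs; the checks below are illustrative
    assumptions, not the criteria used in the paper.
    """
    labels_orig = [argmax(p) for p in probs_orig]
    labels_aug = [argmax(p) for p in probs_aug]
    n = len(labels_orig)

    selected: List[Optional[int]] = []
    for t in range(n):
        # Assumed transformation-invariance check: both views predict the same phase.
        transform_ok = labels_orig[t] == labels_aug[t]

        # Assumed temporal-invariance check: all frames within `window`
        # of frame t share its predicted phase.
        lo, hi = max(0, t - window), min(n, t + window + 1)
        temporal_ok = all(labels_orig[k] == labels_orig[t] for k in range(lo, hi))

        selected.append(labels_orig[t] if transform_ok and temporal_ok else None)
    return selected
```

Under these assumptions, frames near a predicted phase transition and frames whose prediction flips under the transformation are excluded from pseudo-supervision, while stable frames keep their predicted phase as a pseudo-label. A real system would likely combine such checks with confidence scores and repeat selection over self-training rounds.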