IV · CV · Oct 21, 2023

Leveraging Complementary Attention maps in vision transformers for OCT image analysis

arXiv:2310.14005v3
h-index: 2
Originality: Incremental advance
AI Analysis

This work provides an incremental improvement for ophthalmology: it enhances automated screening pipelines for retinal defects.

The authors tackle automated biomarker detection in OCT retinal scans by ensembling two models: a hybrid vision transformer (MaxViT) for local features and a standard vision transformer (EVA-02) for global features. The ensemble achieved a patient-wise F1 score of 0.8527 and won the IEEE Video and Image Processing Cup 2023 competition. The authors then used knowledge distillation to train a single MaxViT that outperformed the ensemble at lower computational cost.
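As a rough illustration of what ensembling two models' predictions can look like, the sketch below averages per-biomarker probabilities from two models and thresholds the result. This is a minimal sketch under assumptions, not the authors' pipeline: the weighted average, the 0.5 threshold, the function names, and the example probabilities are all hypothetical, and the source does not state how the two models' predictions were combined.

```python
from typing import List


def ensemble_probs(probs_a: List[float], probs_b: List[float],
                   weight_a: float = 0.5) -> List[float]:
    """Weighted average of two models' per-biomarker probabilities.

    Assumption: each model emits one probability per biomarker.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same biomarkers")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(probs_a, probs_b)]


def to_labels(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Turn probabilities into present/absent decisions (hypothetical threshold)."""
    return [1 if p >= threshold else 0 for p in probs]


# Made-up probabilities for three biomarkers on one OCT slice.
maxvit_probs = [0.9, 0.2, 0.6]  # stand-in for the MaxViT output
eva02_probs = [0.7, 0.4, 0.2]   # stand-in for the EVA-02 output

combined = ensemble_probs(maxvit_probs, eva02_probs)
labels = to_labels(combined)
```

Averaging is only one option; the design intuition is that a model that is strong on local structures and one that is strong on global structures make different errors, so combining their scores can correct either model's isolated mistakes.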

An Optical Coherence Tomography (OCT) scan yields all possible cross-section images of a retina for detecting biomarkers linked to optical defects. Due to the high volume of data generated, an automated and reliable biomarker detection pipeline is necessary as a primary screening stage. We outline our new state-of-the-art pipeline for identifying biomarkers from OCT scans. In collaboration with trained ophthalmologists, we identify local and global structures in biomarkers. Through a comprehensive and systematic review of existing vision architectures, we evaluate different convolution and attention mechanisms for biomarker detection. We find that MaxViT, a hybrid vision transformer combining convolution layers with strided attention, is better suited for local feature detection, while EVA-02, a standard vision transformer leveraging pure attention and large-scale knowledge distillation, excels at capturing global features. We ensemble the predictions of both models to achieve first place in the IEEE Video and Image Processing Cup 2023 competition on OCT biomarker detection, with a patient-wise F1 score of 0.8527 in the final phase of the competition, 3.8% higher than the next best solution. Finally, we use knowledge distillation to train a single MaxViT that outperforms our ensemble at a fraction of the computation cost.
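The knowledge-distillation step can be sketched as a loss that mixes ground-truth labels with a teacher's soft predictions. This is a generic formulation under stated assumptions, not the authors' training recipe: the abstract does not specify the loss, the teacher, or any hyperparameters, so the binary cross-entropy form, the mixing weight `alpha`, and all names below are hypothetical.

```python
import math
from typing import Sequence


def bce(pred: Sequence[float], target: Sequence[float],
        eps: float = 1e-7) -> float:
    """Mean binary cross-entropy; targets may be hard (0/1) or soft (in [0, 1])."""
    if len(pred) == 0 or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal length")
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def distillation_loss(student_probs: Sequence[float],
                      teacher_probs: Sequence[float],
                      hard_labels: Sequence[int],
                      alpha: float = 0.5) -> float:
    """Hypothetical mix: alpha * loss vs. labels + (1 - alpha) * loss vs. teacher."""
    hard_term = bce(student_probs, hard_labels)
    soft_term = bce(student_probs, teacher_probs)
    return alpha * hard_term + (1.0 - alpha) * soft_term


# Made-up values: the teacher could be, for example, an ensemble's averaged output.
teacher = [0.8, 0.3]
labels = [1, 0]
loss_close = distillation_loss([0.8, 0.3], teacher, labels)
loss_far = distillation_loss([0.5, 0.5], teacher, labels)
```

In this toy setup a student whose probabilities match the teacher's gets a lower loss than an uninformative student, which is the signal that lets a single model absorb behaviour from a larger or combined teacher.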

