IVAICV Apr 20, 2025

Enhancing DR Classification with Swin Transformer and Shifted Window Attention

arXiv:2504.15317v1, 3 citations, h-index: 6, AIME
Originality: Synthesis-oriented
AI Analysis

This work addresses early detection of diabetic retinopathy for clinical screening. Its contribution is incremental: it applies an existing architecture, the Swin Transformer, to a specific domain and adds preprocessing enhancements.

The paper tackles automated diabetic retinopathy classification by proposing a preprocessing pipeline and using the Swin Transformer, achieving accuracies of 89.65% on the Aptos dataset and 97.40% on the IDRiD dataset.

Diabetic retinopathy (DR) is a leading cause of blindness worldwide, underscoring the importance of early detection for effective treatment. However, automated DR classification remains challenging due to variations in image quality, class imbalance, and pixel-level similarities that hinder model training. To address these issues, we propose a robust preprocessing pipeline incorporating image cropping, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and targeted data augmentation to improve model generalization and resilience. Our approach leverages the Swin Transformer, which utilizes hierarchical token processing and shifted window attention to efficiently capture fine-grained features while maintaining linear computational complexity. We validate our method on the Aptos and IDRiD datasets for multi-class DR classification, achieving accuracy rates of 89.65% and 97.40%, respectively. These results demonstrate the effectiveness of our model, particularly in detecting early-stage DR, highlighting its potential for improving automated retinal screening in clinical settings.
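The claim that shifted window attention keeps computational complexity linear can be illustrated with a short sketch. The cost estimates below follow the formulas given in the original Swin Transformer paper, not anything stated in this abstract, and the function names, feature-map size, channel width, and window size are illustrative assumptions rather than the authors' configuration. The `window_ids` helper shows how a cyclic shift of half a window regroups tokens so that neighbours separated by a window border in one layer share a window in the next.

```python
def global_attention_cost(h, w, c):
    """Estimated cost of global self-attention over h*w tokens with c channels."""
    n = h * w
    return 4 * n * c * c + 2 * n * n * c  # second term is quadratic in n


def window_attention_cost(h, w, c, m):
    """Estimated cost when attention is limited to non-overlapping m x m windows."""
    n = h * w
    return 4 * n * c * c + 2 * m * m * n * c  # both terms are linear in n


def window_ids(h, w, m, shift=0):
    """Map each token (row, col) to its window index after a cyclic shift.

    Assumes h and w are multiples of m. A token at row r moves to row
    (r - shift) % h before the grid is cut into m x m windows.
    """
    windows_per_row = w // m
    ids = []
    for r in range(h):
        shifted_r = (r - shift) % h
        row = []
        for col in range(w):
            shifted_c = (col - shift) % w
            row.append((shifted_r // m) * windows_per_row + shifted_c // m)
        ids.append(row)
    return ids


if __name__ == "__main__":
    # Hypothetical sizes: 56 x 56 tokens, 96 channels, 7 x 7 windows.
    print("global  :", global_attention_cost(56, 56, 96))
    print("windowed:", window_attention_cost(56, 56, 96, 7))

    # Tokens (3, 3) and (4, 4) sit on opposite sides of a window border.
    regular = window_ids(8, 8, 4, shift=0)
    shifted = window_ids(8, 8, 4, shift=2)
    print("regular windows:", regular[3][3], regular[4][4])
    print("shifted windows:", shifted[3][3], shifted[4][4])
```

Doubling the number of tokens doubles the windowed estimate but more than doubles the global one, which is the sense in which the windowed variant is linear in image size. Alternating regular and shifted partitions is what lets information cross window borders without falling back to global attention.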
