CV, AI, LG · Dec 9, 2025

KD-OCT: Efficient Knowledge Distillation for Clinical-Grade Retinal OCT Classification

arXiv:2512.09069v1 · h-index: 6 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient, real-time diagnostic tools for age-related macular degeneration screening. It is an incremental advance: it applies existing distillation techniques to a specific medical domain.

The study targets the difficulty of deploying computationally demanding deep learning models for retinal OCT classification in clinical settings. It proposes KD-OCT, a knowledge distillation framework that compresses a high-performance teacher model into a lightweight student model. The student achieves near-teacher performance with substantial reductions in model size and inference time.

Age-related macular degeneration (AMD) and choroidal neovascularization (CNV)-related conditions are leading causes of vision loss worldwide, with optical coherence tomography (OCT) serving as a cornerstone for early detection and management. However, deploying state-of-the-art deep learning models like ConvNeXtV2-Large in clinical settings is hindered by their computational demands. Therefore, it is desirable to develop efficient models that maintain high diagnostic performance while enabling real-time deployment. In this study, a novel knowledge distillation framework, termed KD-OCT, is proposed to compress a high-performance ConvNeXtV2-Large teacher model, enhanced with advanced augmentations, stochastic weight averaging, and focal loss, into a lightweight EfficientNet-B2 student for classifying normal, drusen, and CNV cases. KD-OCT employs real-time distillation with a combined loss balancing soft teacher knowledge transfer and hard ground-truth supervision. The effectiveness of the proposed method is evaluated on the Noor Eye Hospital (NEH) dataset using patient-level cross-validation. Experimental results demonstrate that KD-OCT outperforms comparable multi-scale or feature-fusion OCT classifiers in efficiency-accuracy balance, achieving near-teacher performance with substantial reductions in model size and inference time. Despite the compression, the student model exceeds most existing frameworks, facilitating edge deployment for AMD screening. Code is available at https://github.com/erfan-nourbakhsh/KD-OCT.
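The "combined loss balancing soft teacher knowledge transfer and hard ground-truth supervision" can be illustrated with a minimal sketch of a standard distillation objective. This is not taken from the KD-OCT code: the function names, the temperature of 4.0, the weight `alpha` of 0.5, and the temperature-squared scaling of the soft term are all assumptions chosen for illustration, and the paper's actual formulation may differ. Only the three class names come from the abstract.

```python
import math

# Class order is an assumption; the three class names come from the abstract.
CLASSES = ("normal", "drusen", "CNV")


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical combined loss for one sample.

    alpha weights the soft (teacher) term; 1 - alpha weights the hard
    (ground-truth) term. Both defaults are illustrative assumptions.
    """
    # Soft term: KL(teacher || student) between temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0.0)
    # Scaling by temperature**2 is a common convention, assumed here.
    soft = kl * temperature ** 2

    # Hard term: cross-entropy of the unsoftened student output vs. the label.
    hard = -math.log(softmax(student_logits)[label])

    return alpha * soft + (1.0 - alpha) * hard


if __name__ == "__main__":
    teacher = [0.2, 0.5, 4.0]   # made-up logits, confident in "CNV"
    student = [0.1, 1.5, 2.0]   # made-up logits, less certain
    loss = distillation_loss(student, teacher, label=CLASSES.index("CNV"))
    print(f"combined loss: {loss:.4f}")
```

When the student's logits match the teacher's, the soft term vanishes and only the weighted cross-entropy remains, which makes the balance between the two terms easy to check by hand.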

