TAPE: A two-stage parameter-efficient adaptation framework for foundation models in OCT-OCTA analysis
This work addresses automated ophthalmic diagnosis in resource-constrained clinical settings. It is an incremental improvement that adapts existing foundation models with a novel fine-tuning strategy.
The paper tackles domain shift and task misalignment when adapting foundation models to OCT-OCTA image segmentation. It proposes TAPE, a two-stage parameter-efficient adaptation framework that achieves state-of-the-art generalization performance across diverse pathologies.
Automated analysis of optical coherence tomography (OCT) and OCT angiography (OCTA) images is critical for robust ophthalmic diagnosis. Existing mainstream methods are trained from scratch and rely heavily on massive data and model scale, which hinders their practical deployment in resource-constrained clinical settings. Although transfer learning based on foundation models (FMs) is promising, it still faces two significant challenges: domain shift and task misalignment. To address both, we propose TAPE: A Two-stage Adaptation Framework via Parameter-Efficient Fine-tuning, which strategically decouples adaptation into domain alignment and task fitting for downstream segmentation. Notably, the domain adaptation stage applies parameter-efficient fine-tuning (PEFT) within masked image modeling for medical image domain adaptation, which is, to the best of our knowledge, a novel approach. Applied to retinal layer segmentation with both a universal FM (masked autoencoder, MAE) and a specialized FM (RETFound), TAPE demonstrates superior parameter efficiency and achieves state-of-the-art generalization performance across diverse pathologies.
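
The two-stage decoupling described above can be illustrated with a minimal sketch. It rests on several assumptions the abstract does not confirm: it uses a LoRA-style low-rank adapter as a stand-in for the PEFT method, it picks a hypothetical 1024-wide projection layer and a rank of 8, and it guesses which parameter groups are trainable in each stage. The names `Stage`, `STAGES`, and `lora_param_counts` are illustrative only and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One adaptation stage: its objective and which parameter groups it updates."""
    name: str
    objective: str
    trainable: tuple
    frozen: tuple


def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora) trainable-weight counts for one linear layer.

    Full fine-tuning updates the whole d_in x d_out matrix. A LoRA-style
    adapter (an assumption here, not the paper's stated method) trains two
    low-rank factors of shapes d_in x rank and rank x d_out instead.
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Hypothetical schedule: the stage names and objectives follow the abstract,
# while the trainable/frozen split is an assumption made for illustration.
STAGES = (
    Stage(
        name="domain_alignment",
        objective="masked image modeling on OCT/OCTA images",
        trainable=("peft_adapters",),
        frozen=("backbone",),
    ),
    Stage(
        name="task_fitting",
        objective="supervised retinal layer segmentation",
        trainable=("peft_adapters", "segmentation_head"),
        frozen=("backbone",),
    ),
)

if __name__ == "__main__":
    full, lora = lora_param_counts(d_in=1024, d_out=1024, rank=8)
    print(f"full: {full}, low-rank: {lora}, ratio: {lora / full:.4%}")
    for stage in STAGES:
        print(f"{stage.name}: train {stage.trainable}, freeze {stage.frozen}")
```

Under these assumed sizes, the low-rank adapter trains 16,384 weights for a layer that holds 1,048,576, which gives a rough sense of why keeping the backbone frozen in both stages is parameter-efficient. The actual savings depend on the PEFT method and configuration the authors used.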