CV · May 29

CoFiDA-M: Concept-Aware Feature Modulation for Cross-Domain Adaptation with Image-Only Inference

Nurjahan Sultana, Moi Hoon Yap, Xinqi Fan, Wenqi Lu

arXiv:2605.31591 · 69.1 · Has Code

AI Analysis

This work provides a practical solution for improving the robustness and real-world deployability of AI-based skin cancer screening tools for dermatologists and patients, by enabling models to adapt to consumer-grade images without requiring concept metadata at inference time.

The paper addresses the performance drop of AI models for skin cancer screening when transitioning from expert dermoscopic images to consumer-grade clinical images. The authors propose CoFiDA-M, a framework that uses concept probabilities from foundation models during training to guide a feature modulator, then distills the resulting concept-aware representation into a lightweight, image-only student model. The authors report that this approach significantly outperforms state-of-the-art methods on a multi-dataset benchmark, particularly in melanoma recall.

Models for AI-based skin cancer screening suffer a severe performance drop when shifting from expert dermoscopic (source) images to consumer-grade clinical (target) images, hindering real-world deployment. Existing domain adaptation methods often ignore crucial semantic invariants, such as clinical concepts. While new foundation models like MONET can provide this semantic information as dense, probabilistic scores, this metadata is unavailable at test time, creating a deployment paradox for practical image-only screening tools. We address this gap by proposing CoFiDA-M, a privileged information framework that learns from concepts at training time but deploys as an image-only model. Our method trains a teacher network that uses MONET concept probabilities to guide a FiLM modulator, transforming visual features into a semantically "edited" feature space. A lightweight, image-only student is then trained to reproduce this edited representation, not just the teacher's final predictions. This distillation "bakes" the clinical reasoning into the student's weights. On a challenging multi-dataset benchmark, our image-only student significantly outperforms state-of-the-art approaches, especially in melanoma recall. Our work provides a practical and generalizable framework for leveraging noisy, probabilistic metadata as privileged information, demonstrating strong cross-dataset robustness and potential for real-world deployment beyond dermatology. Implementation code is available at: https://github.com/mmu-dermatology-research/CoFiDA.git
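
The two mechanisms named in the abstract, FiLM modulation by concept probabilities and feature-level distillation, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the toy weights, the two-channel feature size, and the choice of mean squared error as the feature-matching loss are all assumptions made for illustration. The only part taken as given is the standard FiLM form, in which a conditioning vector predicts a per-channel scale and shift that are applied to the features.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, x: Sequence[float]) -> Vector:
    """Affine map W x + b on small dense lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film_modulate(features: Vector, concept_probs: Vector,
                  w_gamma: Matrix, b_gamma: Vector,
                  w_beta: Matrix, b_beta: Vector) -> Vector:
    """Standard FiLM: gamma(c) * f + beta(c), one scale and shift per channel.

    Here c stands in for the concept probabilities (hypothetical toy values,
    not real MONET outputs) and f for the teacher's visual features.
    """
    gamma = linear(w_gamma, b_gamma, concept_probs)
    beta = linear(w_beta, b_beta, concept_probs)
    return [g * f + b for g, f, b in zip(gamma, features, beta)]


def feature_distillation_loss(student_features: Vector,
                              teacher_features: Vector) -> float:
    """Mean squared error between student and teacher features.

    MSE is an assumed choice; the abstract only says the student reproduces
    the edited representation.
    """
    n = len(teacher_features)
    return sum((s - t) ** 2
               for s, t in zip(student_features, teacher_features)) / n


# Toy example: two feature channels, two concept probabilities.
visual_features = [1.0, 3.0]
concept_probs = [1.0, 0.0]
w_gamma, b_gamma = [[2.0, 0.0], [0.0, 1.0]], [0.0, 1.0]
w_beta, b_beta = [[0.5, 0.0], [0.0, 0.0]], [0.0, 0.0]

# Teacher side: concepts edit the features (training time only).
edited = film_modulate(visual_features, concept_probs,
                       w_gamma, b_gamma, w_beta, b_beta)

# Student side: image-only features are pulled toward the edited ones.
student_features = [2.5, 2.0]
loss = feature_distillation_loss(student_features, edited)
```

In this sketch the concept vector is needed only to compute `edited` during training. At inference the student produces its features from the image alone, which is the deployment property the abstract describes.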
