Convolutional Nets Versus Vision Transformers for Diabetic Foot Ulcer Classification
This work addresses diabetic foot ulcer classification in medical imaging, but it is incremental: it compares existing architectures and applies a known optimization technique.
This paper tackles the problem of classifying diabetic foot ulcers by comparing Convolutional Neural Networks (CNNs) and Vision Transformers. It finds that CNNs trained with Sharpness-Aware Minimization (SAM) achieve the best performance, an approach that took first place in the DFUC 2021 Grand-Challenge.
This paper compares well-established Convolutional Neural Networks (CNNs) to recently introduced Vision Transformers for the task of Diabetic Foot Ulcer Classification, in the context of the DFUC 2021 Grand-Challenge, in which this work attained first place. Comprehensive experiments demonstrate that modern CNNs are still capable of outperforming Transformers in a low-data regime, likely owing to their ability to better exploit spatial correlations. In addition, we empirically demonstrate that the recent Sharpness-Aware Minimization (SAM) optimization algorithm considerably improves the generalization capability of both kinds of models. Our results demonstrate that, for this task, the combination of CNNs and SAM outperforms every other approach considered.
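For readers unfamiliar with SAM, the update can be sketched on a toy problem. The snippet below is a minimal illustration of the generic two-step SAM rule (perturb the weights toward higher loss within an L2 ball of radius rho, then descend from the original weights using the gradient taken at the perturbed point). It is not the training code used in this work: the quadratic loss, the matrix `A`, the function names, and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy problem: quadratic loss 0.5 * w^T A w with one sharp direction.
A = np.diag([1.0, 10.0])


def loss(w):
    return 0.5 * w @ A @ w


def grad(w):
    return A @ w


def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One SAM update (illustrative values for lr and rho)."""
    g = grad_fn(w)
    # Step 1: move to an approximate worst-case point within an L2 ball of radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: take the gradient at the perturbed point, but apply it to the original w.
    g_perturbed = grad_fn(w + eps)
    return w - lr * g_perturbed


w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(100):
    w = sam_step(w, grad)

print("initial loss:", loss(w0), "final loss:", loss(w))
```

In a deep learning setting the same two steps require two forward-backward passes per batch, which roughly doubles the cost of each training iteration compared with a standard optimizer.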