Transformer with Selective Shuffled Position Embedding and Key-Patch Exchange Strategy for Early Detection of Knee Osteoarthritis
This work addresses the challenge of limited labeled data in medical imaging for older individuals with knee osteoarthritis, though it appears incremental as it builds on existing Vision Transformer and data augmentation techniques.
The paper tackled the problem of insufficient labeled medical data for early detection of knee osteoarthritis by proposing a novel data augmentation method using Vision Transformer with selective shuffled position embedding and key-patch exchange strategies, resulting in a notable improvement in classification performance for distinguishing KL-0 vs KL-2 grades.
Knee OsteoArthritis (KOA) is a widespread musculoskeletal disorder that can severely impact the mobility of older individuals. Insufficient medical data presents a significant obstacle for effectively training models due to the high cost associated with data labelling. Currently, deep learning-based models extensively utilize data augmentation techniques to improve their generalization ability and alleviate overfitting. However, conventional data augmentation techniques are primarily based on the original data and fail to introduce substantial diversity to the dataset. In this paper, we propose a novel approach based on the Vision Transformer (ViT) model with original Selective Shuffled Position Embedding (SSPE) and key-patch exchange strategies to obtain different input sequences as a method of data augmentation for early detection of KOA (KL-0 vs KL-2). More specifically, we fix and shuffle the position embedding of key and non-key patches, respectively. Then, for the target image, we randomly select other candidate images from the training set to exchange their key patches and thus obtain different input sequences. Finally, a hybrid loss function is developed by incorporating multiple loss functions for different types of the sequences. According to the experimental results, the generated data are considered valid as they lead to a notable improvement in the model's classification performance.