Ensemble of ConvNeXt V2 and MaxViT for Long-Tailed CXR Classification with View-Based Aggregation
This work addresses class imbalance in chest X-ray classification, a problem of practical relevance to radiologists, but the contribution is incremental: it combines existing techniques rather than introducing a new paradigm.
The authors tackle long-tailed classification of chest X-ray findings with an ensemble of ConvNeXt V2 and MaxViT models and view-based prediction aggregation, placing 4th in Subtask 2 and 5th in Subtask 1 of the MICCAI 2024 CXR-LT challenge.
In this work, we present our solution for the MICCAI 2024 CXR-LT challenge, achieving 4th place in Subtask 2 and 5th in Subtask 1. We leveraged an ensemble of ConvNeXt V2 and MaxViT models, pretrained on an external chest X-ray dataset, to address the long-tailed distribution of chest findings. The proposed method combines state-of-the-art image classification techniques, asymmetric loss for handling class imbalance, and view-based prediction aggregation to enhance classification performance. Through experiments, we demonstrate the advantages of our approach in improving both detection accuracy and the handling of the long-tailed distribution in CXR findings. The code is available at https://github.com/yamagishi0824/cxrlt24-multiview-pp.
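The three ingredients named in the abstract (asymmetric loss, model ensembling, and view-based aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation; their code is in the linked repository. The function names are hypothetical, and the loss hyperparameters (gamma_pos=0, gamma_neg=4, margin=0.05) are common defaults for asymmetric loss, not values reported in the abstract. The aggregation shown is a uniform mean over all images of the same study, which is only an assumption about what "view-based aggregation" involves.

```python
import math
from collections import defaultdict


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Asymmetric loss for one multi-label sample, summed over classes.

    Negatives are down-weighted by a focusing exponent and a probability
    margin, so the many easy negatives of rare classes contribute little.
    Hyperparameter values here are assumptions, not the authors' settings.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            p_m = max(p - margin, 0.0)  # shift: very easy negatives cost 0
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


def ensemble_average(prob_sets):
    """Average per-class probabilities across several prediction vectors."""
    n = len(prob_sets)
    return [sum(col) / n for col in zip(*prob_sets)]


def aggregate_by_study(image_probs, image_to_study):
    """Replace each image's prediction with the mean over its study.

    Assumption: all views (e.g. frontal and lateral) of a study are
    weighted equally; the authors' scheme may differ.
    """
    groups = defaultdict(list)
    for image_id, probs in image_probs.items():
        groups[image_to_study[image_id]].append(probs)
    study_mean = {s: ensemble_average(v) for s, v in groups.items()}
    return {img: study_mean[image_to_study[img]] for img in image_probs}
```

In this sketch, per-image predictions from the two models would first be combined with ensemble_average, and the combined predictions would then be passed to aggregate_by_study so that every image in a study shares one prediction vector.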