GAN-based Data Augmentation for Chest X-ray Classification
This work addresses data scarcity and class imbalance in medical imaging, which could help improve diagnostic AI systems. The contribution is incremental, however, since it applies an existing GAN method to a new dataset.
The paper tackles limited and imbalanced training data in medical image classification by applying GAN-based data augmentation to the CheXpert chest X-ray dataset. Compared with traditional augmentation, it reports higher downstream performance for underrepresented classes, especially in low-data regimes.
A common problem in computer vision, particularly in medical applications, is a lack of sufficiently large and diverse training data. These datasets also often suffer from severe class imbalance. As a result, networks often overfit and are unable to generalize to novel examples. Generative Adversarial Networks (GANs) offer a novel method of synthetic data augmentation. In this work, we evaluate the use of GAN-based data augmentation to artificially expand the CheXpert dataset of chest radiographs. We compare performance to traditional augmentation and find that GAN-based augmentation leads to higher downstream performance for underrepresented classes. Furthermore, this effect is more pronounced in low-data regimes. This suggests that GAN-based augmentation is a promising area of research for improving network performance when data collection is prohibitively expensive.
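To make the idea of expanding underrepresented classes concrete, here is a minimal sketch of how one might work out how many synthetic images each class needs before a generator is sampled. It is not the paper's procedure. The function name `synthetic_quota`, the balance-to-the-largest-class rule, and the toy label counts are all assumptions made for illustration, and CheXpert is treated as single-label here for simplicity even though its studies can carry several labels.

```python
from collections import Counter


def synthetic_quota(labels, target=None):
    """Return the number of synthetic images each class needs to reach `target`.

    `labels` is a list with one class name per real training image.
    If `target` is None, every class is topped up to the size of the
    largest class (an assumed balancing rule, not taken from the paper).
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    # Classes already at or above the target need no synthetic images.
    return {cls: max(0, target - n) for cls, n in counts.items()}


# Hypothetical toy label distribution, not real CheXpert statistics.
labels = ["No Finding"] * 6 + ["Edema"] * 3 + ["Fracture"] * 1
quota = synthetic_quota(labels)
print(quota)
```

In a full pipeline, each quota would determine how many images to draw from a class-conditional generator and add to the real training set. A lower `target` can be passed to simulate partial rebalancing.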