IV CV · Jun 30, 2022

The (de)biasing effect of GAN-based augmentation methods on skin lesion images

Agnieszka Mikołajczyk, Sylwia Majchrowska, Sandra Carrasco Limeros

arXiv:2206.15182v114.525 citations · h-index: 12 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses bias in medical AI for dermatology, but the advance is incremental because it builds on existing GAN and bias analysis methods.

The study investigated how GAN-based augmentation methods inherit and amplify biases in skin lesion images from the ISIC dataset, finding that synthetic data can strengthen spurious correlations in classification models.

New medical datasets are now more open to the public, allowing for better and more extensive research. Although prepared with the utmost care, new datasets might still be a source of spurious correlations that affect the learning process. Moreover, data collections are usually not large enough and are often unbalanced. One approach to alleviate the data imbalance is using data augmentation with Generative Adversarial Networks (GANs) to extend the dataset with high-quality images. GANs are usually trained on the same biased datasets as the target data, resulting in more biased instances. This work explored unconditional and conditional GANs to compare their bias inheritance and how the synthetic data influenced the models. We provided extensive manual data annotation of possibly biasing artifacts on the well-known ISIC dataset with skin lesions. In addition, we examined classification models trained on both real and synthetic data with counterfactual bias explanations. Our experiments showed that GANs inherited biases and sometimes even amplified them, leading to even stronger spurious correlations. Manual data annotation and synthetic images are publicly available for reproducible scientific research.
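The idea of probing a classifier with counterfactual bias explanations can be illustrated with a toy sketch: insert a possibly biasing artifact into each image and count how often the predicted label changes. The abstract does not specify the authors' exact procedure, so everything below is an assumption made for illustration: the black-frame artifact, the helper names `add_black_frame` and `switch_rate`, and the deliberately biased `toy_classifier` are all hypothetical and use plain Python lists in place of real images and models.

```python
from typing import Callable, List, Sequence

# Grayscale image as rows of pixel intensities in [0, 1] (illustrative stand-in).
Image = List[List[float]]


def add_black_frame(image: Image, width: int = 1) -> Image:
    """Return a copy of `image` with a black border `width` pixels wide.

    The black frame is a hypothetical example of a biasing artifact.
    """
    h, w = len(image), len(image[0])
    return [
        [
            0.0
            if (r < width or r >= h - width or c < width or c >= w - width)
            else image[r][c]
            for c in range(w)
        ]
        for r in range(h)
    ]


def switch_rate(
    classify: Callable[[Image], int],
    images: Sequence[Image],
    insert_artifact: Callable[[Image], Image],
) -> float:
    """Fraction of images whose predicted label changes once the artifact is inserted."""
    if not images:
        raise ValueError("images must not be empty")
    switched = sum(
        1 for img in images if classify(img) != classify(insert_artifact(img))
    )
    return switched / len(images)


def toy_classifier(image: Image) -> int:
    """Deliberately biased toy model: predicts class 1 when the image is dark overall."""
    pixels = [p for row in image for p in row]
    return int(sum(pixels) / len(pixels) < 0.5)


# Two constant 4x4 images: one bright (0.6), one already dark (0.2).
bright = [[0.6] * 4 for _ in range(4)]
dark = [[0.2] * 4 for _ in range(4)]
print(switch_rate(toy_classifier, [bright, dark], add_black_frame))
```

A high switch rate would suggest that the model relies on the artifact rather than on the lesion itself. Under these assumptions, comparing the rate for a model trained on real data against one trained with synthetic images added gives a rough sense of whether augmentation strengthened the spurious correlation.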
