IV CV LG · Oct 7, 2020

Evaluating the Clinical Realism of Synthetic Chest X-Rays Generated Using Progressively Growing GANs

Bradley Segal, David M. Rubin, Grace Rubin, Adam Pantanowitz

arXiv:2010.03975v2 · 49 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for labelled medical imaging data while protecting patient privacy, but it is incremental as it builds on existing GAN methods with a new optimization technique.

The paper tackled the problem of generating synthetic chest x-rays to augment training data for diagnostic tools, addressing patient confidentiality constraints, and found that radiologists classified synthetic images as real more often than by chance, though true realism was not yet achieved.

Chest x-rays are a vital tool in the workup of many patients. Similar to most medical imaging modalities, they are profoundly multi-modal and are capable of visualising a variety of combinations of conditions. There is an ever-pressing need for greater quantities of labelled data to develop new diagnostic tools; however, this is in direct opposition to concerns regarding patient confidentiality, which constrain access through permission requests and ethics approvals. Previous work has sought to address these concerns by creating class-specific GANs that synthesise images to augment training data. These approaches cannot be scaled, as they introduce computational trade-offs between model size and class number, which place fixed limits on the quality that such generates can achieve. We address this concern by introducing latent class optimisation, which enables efficient, multi-modal sampling from a GAN and with which we synthesise a large archive of labelled generates. We apply a PGGAN to the task of unsupervised x-ray synthesis and have radiologists evaluate the clinical realism of the resultant samples. We provide an in-depth review of the properties of varying pathologies seen on generates as well as an overview of the extent of disease diversity captured by the model. We validate the application of the Fréchet Inception Distance (FID) to measure the quality of x-ray generates and find that the resulting scores are similar to those of other high-resolution tasks. We quantify x-ray clinical realism by asking radiologists to distinguish between real and fake scans and find that generates are more likely to be classed as real than by chance, but further progress is required to achieve true realism. We confirm these findings by evaluating synthetic classification model performance on real scans. We conclude by discussing the limitations of PGGAN generates and how to achieve controllable, realistic generates.
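
For readers unfamiliar with the FID mentioned in the abstract, the standard definition is given below as a reference. The abstract does not spell out the formula, so the symbols here are assumptions for illustration and not the authors' notation: $\mu_r, \Sigma_r$ denote the mean and covariance of Inception features computed on real scans, and $\mu_g, \Sigma_g$ the same statistics computed on generates.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer, and the score is zero when the means and covariances match exactly.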
