IV CV LG | Dec 12, 2023

On the notion of Hallucinations from the lens of Bias and Validity in Synthetic CXR Images

Gauri Bhardwaj, Yuvaraj Govindarajulu, Sundaraparipurnan Narayanan, Pavan Kulkarni, Manojkumar Parmar

arXiv:2312.06979v13.04 citations | h-index: 4 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses patient safety risks in medical AI applications by identifying hallucinations and fairness issues in synthetic medical images, though it is incremental on prior research.

The researchers examined bias and validity in synthetic chest X-ray (CXR) images generated by a fine-tuned Stable Diffusion model (RoentGen). A COVID classifier flagged approximately 42% of the images as incorrectly indicating COVID (latent hallucinations), and the bias analysis revealed disparities in classification performance across subgroups, most pronounced for the Female Hispanic subgroup.

Medical imaging has revolutionized disease diagnosis, yet its potential is hampered by limited access to diverse and privacy-conscious datasets. Open-source medical datasets, while valuable, suffer from disparities in data quality and clinical information. Generative models, such as diffusion models, aim to mitigate these challenges. At Stanford, researchers explored the utility of a fine-tuned Stable Diffusion model (RoentGen) for medical imaging data augmentation. Our work extends the Stanford research question, "Could Stable Diffusion Solve a Gap in Medical Imaging Data?", by examining the bias and validity of the generated outcomes. We used RoentGen to produce synthetic chest X-ray (CXR) images and assessed them for bias, validity, and hallucinations. Diagnostic accuracy was evaluated by a disease classifier, while a COVID classifier uncovered latent hallucinations. The bias analysis revealed disparities in classification performance among subgroups, with a pronounced impact on the Female Hispanic subgroup. Furthermore, incorporating race and gender into input prompts exacerbated fairness issues in the generated images. The quality of the synthetic images varied, particularly in certain disease classes, where uncertainty was greater than for the original images. Additionally, we observed latent hallucinations: approximately 42% of the images incorrectly indicated COVID. These findings suggest new research directions toward the interpretability of synthetic CXR images, to better understand the associated risks and patient safety in medical applications.
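
The two kinds of measurement the abstract describes, per-subgroup classification performance and the share of images a COVID classifier flags, can be illustrated with a minimal sketch. It is not the authors' code. The function names, the record layout, and the toy values are all hypothetical, and accuracy is used only as a stand-in for whatever performance metric the paper reports.

```python
from collections import defaultdict


def subgroup_accuracy(records):
    """Accuracy of a disease classifier per demographic subgroup.

    `records` is an iterable of (subgroup, true_label, predicted_label)
    tuples; this layout is an assumption made for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subgroup, truth, pred in records:
        total[subgroup] += 1
        correct[subgroup] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Difference between the best and worst subgroup accuracy."""
    return max(per_group.values()) - min(per_group.values())


def hallucination_rate(covid_flags):
    """Fraction of synthetic images flagged as COVID by a classifier.

    Assumes none of the generating prompts mentioned COVID, so every
    positive flag counts as a latent hallucination.
    """
    flags = list(covid_flags)
    return sum(flags) / len(flags) if flags else 0.0


# Toy values for illustration only; they are not results from the paper.
records = [
    ("group_a", "pneumonia", "pneumonia"),
    ("group_a", "edema", "edema"),
    ("group_b", "pneumonia", "edema"),
    ("group_b", "edema", "edema"),
]
per_group = subgroup_accuracy(records)
gap = accuracy_gap(per_group)
rate = hallucination_rate([True, False, False, True, False])
```

A large gap between subgroups, or a high flag rate on images whose prompts never mentioned COVID, would correspond to the kinds of disparity and latent hallucination the abstract reports.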
