CV LGSep 29, 2023

Reconstruction of Patient-Specific Confounders in AI-based Radiologic Image Interpretation using Generative Pretraining

Tianyu Han, Laura Žigutytė, Luisa Huck, Marc Huppertz, Robert Siepmann, Yossi Gandelsman, Christian Blüthgen, Firas Khader, Christiane Kuhl, Sven Nebelung, Jakob Kather, Daniel Truhn

arXiv:2309.17123v11.57 citationsh-index: 59Has Code

Originality Incremental advance

AI Analysis

This addresses the need for reliable diagnostic assistance in healthcare by providing patient-specific insights into model confounders, representing an incremental advance in medical image analysis.

The paper tackled the problem of visualizing confounding factors in AI-based radiologic image interpretation by proposing DiffChest, a self-conditioned diffusion model trained on 515,704 chest radiographs, which achieved high inter-reader agreement (Fleiss' Kappa ≥0.8) and accurately captured confounders with prevalence rates from 11.1% to 100%.

Detecting misleading patterns in automated diagnostic assistance systems, such as those powered by Artificial Intelligence, is critical to ensuring their reliability, particularly in healthcare. Current techniques for evaluating deep learning models cannot visualize confounding factors at a diagnostic level. Here, we propose a self-conditioned diffusion model termed DiffChest and train it on a dataset of 515,704 chest radiographs from 194,956 patients from multiple healthcare centers in the United States and Europe. DiffChest explains classifications on a patient-specific level and visualizes the confounding factors that may mislead the model. We found high inter-reader agreement when evaluating DiffChest's capability to identify treatment-related confounders, with Fleiss' Kappa values of 0.8 or higher across most imaging findings. Confounders were accurately captured with 11.1% to 100% prevalence rates. Furthermore, our pretraining process optimized the model to capture the most relevant information from the input radiographs. DiffChest achieved excellent diagnostic accuracy when diagnosing 11 chest conditions, such as pleural effusion and cardiac insufficiency, and at least sufficient diagnostic accuracy for the remaining conditions. Our findings highlight the potential of pretraining based on diffusion models in medical image classification, specifically in providing insights into confounding factors and model robustness.

View on arXiv PDF Code

Similar