IV · LG · QM · ML · Feb 6, 2020

On the limits of cross-domain generalization in automated X-ray prediction

arXiv:2002.02497v2 · 151 citations · Has Code
AI Analysis

This work addresses the problem of unreliable cross-domain generalization in medical imaging for researchers and practitioners, highlighting critical bottlenecks in model deployment.

The study quantifies how well automated X-ray diagnostic prediction generalizes across multiple datasets and finds that the main obstacle is a shift in the labels, not a shift in the images. It also finds that performance and agreement can diverge: models that perform well may disagree in their predictions, and models that agree may still perform poorly.

This large-scale study focuses on quantifying which X-ray diagnostic prediction tasks generalize well across multiple different datasets. We present evidence that the issue of generalization is not due to a shift in the images but instead to a shift in the labels. We study the cross-domain performance, the agreement between models, and the model representations. We find interesting discrepancies between performance and agreement: some models that both achieve good performance disagree in their predictions, while other models agree yet achieve poor performance. We also test for concept similarity by regularizing a network to group tasks across multiple datasets together, and we observe variation across the tasks. All code is made available online and the data is publicly available: https://github.com/mlmed/torchxrayvision
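To make the distinction between performance and agreement concrete, here is a minimal sketch on toy data. It is not the authors' code and does not use torchxrayvision. The scores, the 0.5 threshold, and the choice of Cohen's kappa as the agreement measure are all assumptions made for illustration; the abstract does not say which agreement measure was used. Models A and B both reach a high AUC yet agree no better than chance, while models C and D agree perfectly yet both have a poor AUC.

```python
# Toy illustration (hypothetical data): performance (AUC) and agreement
# (Cohen's kappa, an assumed measure) can diverge between models.

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def binarize(scores, threshold=0.5):
    """Turn scores into 0/1 predictions; the threshold is an arbitrary choice."""
    return [1 if s >= threshold else 0 for s in scores]

def cohen_kappa(a, b):
    """Chance-corrected agreement between two binary prediction lists."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:  # both raters constant and identical
        return 1.0
    return (observed - expected) / (1 - expected)

labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Two hypothetical models that rank well but threshold differently.
model_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
model_b = [0.45, 0.9, 0.8, 0.35, 0.1, 0.3, 0.2, 0.4]

# Two hypothetical models that make the same (mostly wrong) predictions.
model_c = [0.9, 0.2, 0.3, 0.1, 0.8, 0.7, 0.4, 0.6]
model_d = [0.95, 0.25, 0.35, 0.15, 0.85, 0.75, 0.45, 0.65]

print(auc(labels, model_a), auc(labels, model_b))           # 0.9375 0.9375
print(cohen_kappa(binarize(model_a), binarize(model_b)))    # 0.0
print(auc(labels, model_c), auc(labels, model_d))           # 0.25 0.25
print(cohen_kappa(binarize(model_c), binarize(model_d)))    # 1.0
```

The point of the sketch is only that AUC depends on how each model ranks cases against the labels, while agreement compares the models' thresholded outputs with each other, so neither quantity determines the other.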

Code Implementations: 9 repos

Data from Papers with Code (CC-BY-SA-4.0)

