A multimodal slice discovery framework for systematic failure detection and explanation in medical image classification
The work addresses safety and reliability concerns in medical AI systems, though the contribution appears incremental: it extends existing slice discovery methods to multimodal data.
The authors address the detection of systematic failures in medical image classifiers with what they describe as the first automated auditing framework to extend slice discovery methods to multimodal representations, and they report strong failure discovery and explanation generation on the MIMIC-CXR-JPG dataset.
Despite advances in machine learning-based medical image classifiers, the safety and reliability of these systems remain major concerns in practical settings. Existing auditing approaches rely mainly on unimodal features or metadata-based subgroup analyses, which are limited in interpretability and often fail to capture hidden systematic failures. To address these limitations, we introduce the first automated auditing framework that extends slice discovery methods to multimodal representations specifically for medical applications. Comprehensive experiments under common failure scenarios on the MIMIC-CXR-JPG dataset demonstrate the framework's strong capability in both failure discovery and explanation generation. Our results also show that multimodal information generally allows more comprehensive and effective auditing of classifiers, while unimodal variants that use inputs other than images alone show strong potential in resource-constrained scenarios.
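For readers unfamiliar with slice discovery, the general idea can be illustrated with a minimal sketch: group samples in an embedding space, then rank the groups by how often the classifier errs on them. The code below is a generic clustering-based illustration, not the framework described in the abstract. The function names (`kmeans`, `discover_slices`), the choice of k-means, and the concatenation of L2-normalized image and text embeddings are all assumptions made for this example, and the data are synthetic.

```python
import numpy as np


def kmeans(x, k, n_iter=50, seed=0):
    """Plain k-means; returns a cluster index for every row of x."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centers does not modify x.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return assign


def discover_slices(image_emb, text_emb, errors, k=3):
    """Hypothetical helper: cluster joint embeddings, rank clusters by error rate.

    errors is a 0/1 array marking samples the classifier got wrong.
    Returns (ranked, assign), where ranked is a list of
    (cluster_id, size, error_rate) sorted by descending error rate.
    """
    def l2norm(a):
        return a / np.linalg.norm(a, axis=1, keepdims=True)

    # Assumption: a simple multimodal representation formed by
    # concatenating the normalized image and text embeddings.
    joint = np.concatenate([l2norm(image_emb), l2norm(text_emb)], axis=1)
    assign = kmeans(joint, k)
    ranked = []
    for j in range(k):
        mask = assign == j
        if mask.any():
            ranked.append((j, int(mask.sum()), float(errors[mask].mean())))
    ranked.sort(key=lambda s: s[2], reverse=True)
    return ranked, assign


# Synthetic demo: three latent groups, one with a much higher error rate.
rng = np.random.default_rng(0)
n = 300
group = rng.integers(0, 3, size=n)
image_emb = rng.normal(size=(n, 8)) + 5 * np.eye(3, 8)[group]
text_emb = rng.normal(size=(n, 4)) + 5 * np.eye(3, 4)[group]
errors = (rng.random(n) < np.where(group == 2, 0.6, 0.05)).astype(float)

ranked, assign = discover_slices(image_emb, text_emb, errors, k=3)
for cluster_id, size, err in ranked:
    print(f"slice {cluster_id}: size={size}, error rate={err:.2f}")
```

The slice at the top of the ranking is the candidate systematic failure. In an auditing setting, the text side of the representation is what would make such a slice describable in words, although how explanations are generated in the framework above is not specified in this abstract.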