InfoMask: Masked Variational Latent Representation to Localize Chest Disease
This addresses the challenge of limited annotated medical images for disease localization, offering a weakly supervised solution that improves accuracy in a specific medical domain.
The paper tackles the problem of localizing pneumonia in chest X-rays without pixel-level annotations by proposing a learned spatial masking mechanism to filter noisy attention maps, resulting in more accurate localization of discriminatory regions as tested on the ChestX-ray8 dataset.
The scarcity of richly annotated medical images is limiting supervised deep learning based solutions to medical image analysis tasks, such as localizing discriminatory radiomic disease signatures. Therefore, it is desirable to leverage unsupervised and weakly supervised models. Most recent weakly supervised localization methods apply attention maps or region proposals in a multiple instance learning formulation. While attention maps can be noisy, leading to erroneously highlighted regions, it is not simple to decide on an optimal window/bag size for multiple instance learning approaches. In this paper, we propose a learned spatial masking mechanism to filter out irrelevant background signals from attention maps. The proposed method minimizes mutual information between a masked variational representation and the input while maximizing the information between the masked representation and class labels. This results in more accurate localization of discriminatory regions. We tested the proposed model on the ChestX-ray8 dataset to localize pneumonia from chest X-ray images without using any pixel-level or bounding-box annotations.