Probabilistic Integration of Object Level Annotations in Chest X-ray Classification
This addresses the challenge of efficiently using scarce expert annotations in medical imaging for improved disease classification, though it is incremental as it builds on existing methods with a novel integration approach.
The paper tackles the problem of limited medical image annotations by proposing a probabilistic latent variable model for chest X-ray disease classification, which integrates global labels and object-level annotations through a two-stage optimization algorithm, resulting in consistent classification improvements across benchmark datasets like Chest X-ray14 and MIMIC-CXR.
Medical image datasets and their annotations are not growing as fast as their equivalents in the general domain. This makes translation from the newest, more data-intensive methods that have made a large impact on the vision field increasingly more difficult and less efficient. In this paper, we propose a new probabilistic latent variable model for disease classification in chest X-ray images. Specifically we consider chest X-ray datasets that contain global disease labels, and for a smaller subset contain object level expert annotations in the form of eye gaze patterns and disease bounding boxes. We propose a two-stage optimization algorithm which is able to handle these different label granularities through a single training pipeline in a two-stage manner. In our pipeline global dataset features are learned in the lower level layers of the model. The specific details and nuances in the fine-grained expert object-level annotations are learned in the final layers of the model using a knowledge distillation method inspired by conditional variational inference. Subsequently, model weights are frozen to guide this learning process and prevent overfitting on the smaller richly annotated data subsets. The proposed method yields consistent classification improvement across different backbones on the common benchmark datasets Chest X-ray14 and MIMIC-CXR. This shows how two-stage learning of labels from coarse to fine-grained, in particular with object level annotations, is an effective method for more optimal annotation usage.