GazeLT: Visual attention-guided long-tailed disease classification in chest radiographs
This work addresses automated interpretation of chest radiographs by improving the classification of rare diseases; the contribution is incremental in that it builds on existing visual attention methods.
The authors address long-tailed disease classification in chest radiographs by integrating human visual attention patterns into a deep learning framework. On two datasets, GazeLT outperforms the best long-tailed loss by 4.1% and a visual attention-based baseline by 21.7% in average accuracy.
In this work, we present GazeLT, a human visual attention integration-disintegration approach for long-tailed disease classification. A radiologist's eye gaze has distinct patterns that capture both fine-grained and coarser-level disease-related information. A radiologist's attention also varies over the course of interpreting an image, and incorporating this variation into a deep learning framework is critical to improving automated image interpretation. Another important aspect of visual attention is that, apart from looking at major or obvious disease patterns, experts also look at minor or incidental findings (a few of which constitute long-tailed classes) during image interpretation. GazeLT harnesses the temporal aspect of the visual search process, via an integration and disintegration mechanism, to improve long-tailed disease classification. We show the efficacy of GazeLT on two publicly available datasets for long-tailed disease classification, namely NIH-CXR-LT (n=89237) and MIMIC-CXR-LT (n=111898). GazeLT outperforms the best long-tailed loss by 4.1% and the visual attention-based baseline by 21.7% in average accuracy on these datasets. Our code is available at https://github.com/lordmoinak1/gazelt.
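
To make the temporal aspect of gaze concrete, the following minimal sketch shows one way timestamped fixations could be turned into a single aggregated attention map and into per-time-window maps. It is an illustration under stated assumptions, not the GazeLT implementation: the abstract does not specify how integration and disintegration are computed, and the fixation format `(x, y, t)`, the grid cell size, the equal-duration windows, and the function names `fixation_grid` and `split_by_time` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical fixation format: (x_pixel, y_pixel, timestamp_seconds).
Fixation = Tuple[float, float, float]


def fixation_grid(fixations: List[Fixation], width: int, height: int,
                  cell: int) -> List[List[float]]:
    """Count fixations per grid cell and normalise so the grid sums to 1."""
    rows = -(-height // cell)  # ceiling division
    cols = -(-width // cell)
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, _ in fixations:
        if 0 <= x < width and 0 <= y < height:  # ignore off-image fixations
            grid[int(y) // cell][int(x) // cell] += 1.0
    total = sum(sum(row) for row in grid)
    if total > 0:
        grid = [[v / total for v in row] for row in grid]
    return grid


def split_by_time(fixations: List[Fixation],
                  n_segments: int) -> List[List[Fixation]]:
    """Split fixations into n_segments equal-duration time windows."""
    segments: List[List[Fixation]] = [[] for _ in range(n_segments)]
    if not fixations:
        return segments
    t0 = min(f[2] for f in fixations)
    t1 = max(f[2] for f in fixations)
    span = (t1 - t0) or 1.0  # avoid division by zero for a single timestamp
    for f in fixations:
        idx = min(int((f[2] - t0) / span * n_segments), n_segments - 1)
        segments[idx].append(f)
    return segments


# Toy reading: two early fixations top-left, two late fixations bottom-right.
gaze = [(10, 10, 0.0), (12, 11, 0.5), (100, 100, 3.0), (110, 90, 4.0)]

# Assumed "integrated" view: all fixations aggregated into one map.
integrated = fixation_grid(gaze, width=128, height=128, cell=64)

# Assumed "disintegrated" view: one map per time window.
early, late = [fixation_grid(seg, 128, 128, 64)
               for seg in split_by_time(gaze, n_segments=2)]
```

In this toy example the aggregated map spreads attention over two regions, while the per-window maps separate where the reader looked first from where they looked later. That separation is the kind of temporal signal the abstract refers to; how GazeLT actually uses it is described in the paper and the linked repository.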