CV · Aug 5, 2024

More Than Positive and Negative: Communicating Fine Granularity in Medical Diagnosis

arXiv:2408.02214v1 · h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the gap between binary AI models and complex real-world medical scenarios, aiming to improve diagnostic accuracy for healthcare applications, though it is incremental in nature.

The paper tackles the problem of oversimplified binary classification in AI-based chest X-ray analysis by introducing a benchmark for fine-grained diagnosis, dividing positive cases into atypical and typical subcategories, and proposing a risk modulation method that achieves superior performance using only coarse labels.

With the advance of deep learning, much progress has been made in building powerful artificial intelligence (AI) systems for automatic chest X-ray (CXR) analysis. Most existing AI models are trained as binary classifiers that distinguish positive cases from negative ones. However, a large gap exists between this simple binary setting and complicated real-world medical scenarios. In this work, we reinvestigate the problem of automatic radiology diagnosis. We first observe that there is considerable diversity among cases within the positive class, which means that simply classifying them as positive loses many important details. This motivates us to build AI models that can communicate fine-grained knowledge from medical images, as human experts do. To this end, we first propose a new benchmark for fine-granularity learning from medical images. Specifically, we devise a division rule based on medical knowledge to divide positive cases into two subcategories, namely atypical positive and typical positive. Then, we propose a new metric, termed AUC$^\text{FG}$, computed on the two subcategories to evaluate a model's ability to separate them. With the proposed benchmark, we encourage the community to develop AI diagnosis systems that can better learn fine granularity from medical images. Last, we propose a simple risk modulation approach to this problem that uses only coarse labels in training. Empirical results show that, despite its simplicity, the proposed method achieves superior performance and thus serves as a strong baseline.
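The abstract does not give the exact definition of AUC$^\text{FG}$, so the following is only a rough sketch of one plausible reading: a pairwise AUC computed over positive cases alone, measuring how often a typical positive receives a higher predicted risk than an atypical positive. The function name `auc_fg`, the label encoding (0 = negative, 1 = atypical positive, 2 = typical positive), and the assumption that typical cases should score higher are all illustrative choices, not taken from the paper.

```python
import numpy as np


def auc_fg(scores, fine_labels):
    """Pairwise AUC between the two positive subcategories (illustrative only).

    Assumed encoding: 0 = negative, 1 = atypical positive, 2 = typical positive.
    Negatives are ignored; typical positives are assumed to warrant higher scores.
    """
    scores = np.asarray(scores, dtype=float)
    fine_labels = np.asarray(fine_labels)

    atypical = scores[fine_labels == 1]
    typical = scores[fine_labels == 2]
    if atypical.size == 0 or typical.size == 0:
        raise ValueError("need at least one case in each positive subcategory")

    # Compare every typical score with every atypical score.
    diff = typical[:, None] - atypical[None, :]
    wins = (diff > 0).sum()
    ties = (diff == 0).sum()
    # Fraction of correctly ordered pairs, counting ties as half.
    return float((wins + 0.5 * ties) / diff.size)


# Hypothetical scores from a model trained with coarse labels only.
scores = [0.1, 0.4, 0.6, 0.9, 0.5, 0.8]
labels = [0, 1, 1, 2, 2, 2]
value = auc_fg(scores, labels)
```

In this toy example, five of the six typical/atypical pairs are ordered correctly, so `value` is 5/6. A model that separates negatives from positives perfectly could still score near 0.5 here, which is the kind of gap the benchmark is meant to expose.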

