IV CV MM · Jan 6, 2025

Ultrasound-QBench: Can LLMs Aid in Quality Assessment of Ultrasound Imaging?

Hongyi Miao, Jun Jia, Yankun Cao, Yingjie Zhou, Yanwei Jiang, Zhi Liu, Guangtao Zhai

arXiv:2501.02751v1 · 13.44 citations · h-index: 27 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for automated quality assessment in ultrasound imaging to assist clinicians, but it is incremental as it primarily establishes a benchmark rather than proposing a new method.

The authors address the problem of low-quality ultrasound imaging by introducing Ultrasound-QBench, a benchmark that evaluates multimodal large language models (MLLMs) on quality assessment tasks. It uses two datasets, IVUSQA (7,709 images) and CardiacUltraQA (3,863 images), annotated into three quality levels, and the authors find that MLLMs show preliminary capabilities for this task.

With the dramatic upsurge in the volume of ultrasound examinations, the number of low-quality ultrasound images has gradually increased due to variations in operator proficiency and imaging circumstances, imposing a severe burden on diagnostic accuracy and even entailing the risk of restarting the diagnosis in critical cases. To assist clinicians in selecting high-quality ultrasound images and ensuring accurate diagnoses, we introduce Ultrasound-QBench, a comprehensive benchmark that systematically evaluates multimodal large language models (MLLMs) on quality assessment tasks for ultrasound images. Ultrasound-QBench establishes two datasets collected from diverse sources: IVUSQA, consisting of 7,709 images, and CardiacUltraQA, containing 3,863 images. These images, which encompass common ultrasound imaging artifacts, are annotated by professional ultrasound experts and classified into three quality levels: high, medium, and low. To better evaluate MLLMs, we decompose the quality assessment task into three dimensions: qualitative classification, quantitative scoring, and comparative assessment. The evaluation of 7 open-source MLLMs and 1 proprietary MLLM demonstrates that MLLMs possess preliminary capabilities for low-level visual tasks in ultrasound image quality classification. We hope this benchmark will inspire the research community to delve deeper into uncovering and enhancing the untapped potential of MLLMs for medical imaging tasks.
