Parameter choices in HaarPSI for IQA with medical images
This work addresses the suitability of image quality assessment (IQA) measures for medical images and offers an incremental improvement for researchers and practitioners in medical imaging and machine learning.
The authors tackle the problem that full-reference image quality assessment (FR-IQA) measures, which are optimized for natural images, are also used in medical image settings. They optimize the parameters of HaarPSI for medical data sets, yielding a novel setting called HaarPSI_MED that significantly improves performance (p<0.05) and shows generalizability across different medical image types.
When developing machine learning models, image quality assessment (IQA) measures are a crucial component for evaluating the output images. However, commonly used full-reference IQA (FR-IQA) measures have been primarily developed and optimized for natural images. In many specialized settings, such as medical imaging, this poses an often overlooked problem regarding suitability. In previous studies, the FR-IQA measure HaarPSI showed promising behavior regarding generalizability. The measure is based on Haar wavelet representations, and its framework allows optimization of two parameters. So far, these parameters have been tuned for natural images. Here, we optimize these parameters for two medical image data sets, a photoacoustic and a chest X-ray data set, with IQA expert ratings. We observe that both data sets lead to similar optimal parameter values, which differ from those for natural image data, and that both are more sensitive to parameter changes. We denote the novel optimized setting as HaarPSI$_{MED}$, which significantly improves performance on the employed medical images (p<0.05). Additionally, we include an independent CT test data set that illustrates the generalizability of HaarPSI$_{MED}$, as well as visual examples that qualitatively demonstrate the improvement. The results suggest that adapting common IQA measures within their frameworks for medical images can provide a valuable, generalizable addition to the use of more specific task-based measures.
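The parameter optimization described above can be pictured as a grid search that keeps the parameter pair whose scores agree best with the expert ratings. The following is only a minimal sketch under stated assumptions, not the authors' procedure: the agreement criterion (Spearman rank correlation), the parameter names `c` and `alpha`, the grids, the toy data, and the function `toy_measure` (a simple stand-in that is not HaarPSI) are all illustrative.

```python
from itertools import product
from math import exp
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / (var_x * var_y) ** 0.5


def grid_search(measure, pairs, ratings, grid_a, grid_b):
    """Return (best_correlation, a, b) over all parameter combinations."""
    best = None
    for a, b in product(grid_a, grid_b):
        scores = [measure(ref, dist, a, b) for ref, dist in pairs]
        rho = spearman(scores, ratings)
        if best is None or rho > best[0]:
            best = (rho, a, b)
    return best


def toy_measure(ref, dist, c, alpha):
    """Hypothetical two-parameter similarity score; NOT HaarPSI."""
    sims = [(2 * r * d + c) / (r * r + d * d + c) for r, d in zip(ref, dist)]
    return mean(1 / (1 + exp(-alpha * s)) for s in sims)


# Made-up data: (reference, distorted) pixel lists and expert ratings.
pairs = [
    ([10.0, 20.0, 30.0], [10.0, 21.0, 29.0]),
    ([10.0, 20.0, 30.0], [14.0, 15.0, 36.0]),
    ([10.0, 20.0, 30.0], [2.0, 35.0, 12.0]),
]
ratings = [4.5, 3.0, 1.0]

rho, c_best, alpha_best = grid_search(
    toy_measure, pairs, ratings, grid_a=[5, 30, 100], grid_b=[2.0, 4.0, 6.0]
)
print(f"best correlation {rho:.3f} at c={c_best}, alpha={alpha_best}")
```

With a real measure in place of `toy_measure`, the same loop could be run once per data set to compare the resulting optima, which is the kind of comparison the abstract reports for the photoacoustic and chest X-ray data.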