Quantitative Metrics for Benchmarking Medical Image Harmonization
This work addresses the challenge of evaluating harmonization methods in medical imaging, where standardized datasets are scarce, by providing quantitative metrics for benchmarking.
The authors propose three new metrics for benchmarking medical image harmonization techniques that require no ground truth, and show on a dataset with available ground truth that these metrics correlate with established image quality metrics.
Image harmonization is an important preprocessing strategy for addressing domain shifts that arise when medical imaging data are acquired with different machines and scanning protocols. However, benchmarking the effectiveness of harmonization techniques has been a challenge because standardized datasets with ground truth are not widely available. In this context, we propose three metrics for evaluating the harmonization of medical images, none of which requires ground truth: two intensity harmonization metrics and one anatomy preservation metric. Through extensive studies on a dataset with available harmonization ground truth, we demonstrate that our metrics correlate with established image quality assessment metrics. We show how these novel metrics may be applied to real-world scenarios where no harmonization ground truth exists. Additionally, we provide insights into different interpretations of the metric values, shedding light on their significance in the context of the harmonization process. Based on our findings, we advocate for the adoption of these quantitative harmonization metrics as a standard for benchmarking the performance of image harmonization techniques.
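The validation idea described above, checking that a score computed without ground truth tracks an established full-reference metric, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three proposed metrics are not defined in this abstract, so `mean_intensity_gap` is a hypothetical stand-in, and the helper names (`psnr`, `rank`, `spearman_rho`), the gain-based degradation, and the synthetic images are all invented for the example.

```python
import numpy as np


def psnr(reference, image, data_range=1.0):
    """Peak signal-to-noise ratio, a standard full-reference quality metric."""
    mse = np.mean((reference - image) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def mean_intensity_gap(image, target_domain_image):
    """Hypothetical stand-in for a reference-free intensity score.

    Compares the image against an UNPAIRED image from the target domain,
    so no harmonization ground truth is needed. Lower means closer.
    """
    return float(abs(image.mean() - target_domain_image.mean()))


def rank(values):
    """Ranks from 0 to n-1 (assumes no ties, which holds for the data below)."""
    values = np.asarray(values)
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])


# Synthetic data (assumption): one ground-truth image, one unpaired
# target-domain image, and "harmonized" outputs of decreasing quality
# simulated by scaling the intensities with a gain factor.
rng = np.random.default_rng(0)
ground_truth = rng.random((64, 64))
target_domain_image = rng.random((64, 64))
gains = [0.9, 0.8, 0.7, 0.6, 0.5]
outputs = [gain * ground_truth for gain in gains]

# Full-reference scores use the ground truth; the stand-in score does not.
psnr_scores = [psnr(ground_truth, out) for out in outputs]
gap_scores = [mean_intensity_gap(out, target_domain_image) for out in outputs]

rho = spearman_rho(psnr_scores, gap_scores)
print(f"Spearman correlation between PSNR and stand-in score: {rho:.3f}")
```

A strongly negative correlation here would mean the stand-in score grows as PSNR falls, which is the kind of agreement one would look for before trusting a reference-free score on data where no ground truth exists. A rank correlation is used in this sketch because the two scores live on different scales; the abstract does not state which correlation measure the authors used.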