Benchmarking Deep Learning-Based Reconstruction Methods for Photoacoustic Computed Tomography with Clinically Relevant Synthetic Datasets

Panpan Chen, Seonyeong Park, Gangwon Jeong, Refik Mert Cam, Umberto Villa, Mark A. Anastasio

arXiv:2601.17165v11.22 citationsh-index: 8Has Code

Originality Synthesis-oriented

AI Analysis

This addresses reproducibility and clinical relevance issues for researchers in medical imaging, though it is incremental as it provides a framework rather than a new method.

The authors tackled the lack of standardized evaluation for deep learning-based reconstruction methods in photoacoustic computed tomography by proposing a benchmarking framework with synthetic datasets and task-based metrics, revealing that some methods performed well on traditional metrics but failed to accurately recover lesions.

Deep learning (DL)-based image reconstruction methods for photoacoustic computed tomography (PACT) have developed rapidly in recent years. However, most existing methods have not employed standardized datasets, and their evaluations rely on traditional image quality (IQ) metrics that may lack clinical relevance. The absence of a standardized framework for clinically meaningful IQ assessment hinders fair comparison and raises concerns about the reproducibility and reliability of reported advancements in PACT. A benchmarking framework is proposed that provides open-source, anatomically plausible synthetic datasets and evaluation strategies for DL-based acoustic inversion methods in PACT. The datasets each include over 11,000 two-dimensional (2D) stochastic breast objects with clinically relevant lesions and paired measurements at varying modeling complexity. The evaluation strategies incorporate both traditional and task-based IQ measures to assess fidelity and clinical utility. A preliminary benchmarking study is conducted to demonstrate the framework's utility by comparing DL-based and physics-based reconstruction methods. The benchmarking study demonstrated that the proposed framework enabled comprehensive, quantitative comparisons of reconstruction performance and revealed important limitations in certain DL-based methods. Although they performed well according to traditional IQ measures, they often failed to accurately recover lesions. This highlights the inadequacy of traditional metrics and motivates the need for task-based assessments. The proposed benchmarking framework enables systematic comparisons of DL-based acoustic inversion methods for 2D PACT. By integrating clinically relevant synthetic datasets with rigorous evaluation protocols, it enables reproducible, objective assessments and facilitates method development and system optimization in PACT.

View on arXiv PDF

Similar