IV · CV · Nov 3, 2023

Simulation of acquisition shifts in T2 Flair MR images to stress test AI segmentation networks

Christiane Posselt, Mehmet Yigit Avci, Mehmet Yigitsoy, Patrick Schünke, Christoph Kolbitsch, Tobias Schäffter, Stefanie Remmele

arXiv:2311.01894v1 · 3.0 · h-index: 9

Originality: Incremental advance

AI Analysis

This work addresses robustness issues in clinical neuroimaging AI for tasks such as MS lesion segmentation, though it is incremental because it builds on existing simulation and stress-testing concepts.

The authors address the vulnerability of AI segmentation networks to acquisition shifts in T2 FLAIR MRI by developing a simulation framework for stress testing these networks. They show that the dependence of the F1 score on echo time (TE) and inversion time (TI) can be described by quadratic model functions with R^2 > 0.9, and that changes in TE affect performance more than changes in TI.
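To make the idea of a quadratic F1 model concrete, here is a minimal sketch of a least-squares fit with an R^2 check. The abstract only says "quadratic model functions", so the full second-order form with a TE·TI cross term, the normalisation of TE and TI to relative offsets from a baseline protocol, and all function names are assumptions made for illustration. The numbers in the demo are synthetic placeholders, not results from the paper.

```python
import numpy as np

def design_matrix(te, ti, te0, ti0):
    """Second-order polynomial terms in relative TE/TI offsets (assumed model form)."""
    x = (np.asarray(te, dtype=float) - te0) / te0  # relative TE shift
    y = (np.asarray(ti, dtype=float) - ti0) / ti0  # relative TI shift
    return np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])

def fit_f1_model(te, ti, f1, te0, ti0):
    """Least-squares fit of F1(TE, TI); returns coefficients and R^2."""
    f1 = np.asarray(f1, dtype=float)
    X = design_matrix(te, ti, te0, ti0)
    coeffs, *_ = np.linalg.lstsq(X, f1, rcond=None)
    residual = f1 - X @ coeffs
    r2 = 1.0 - np.sum(residual**2) / np.sum((f1 - f1.mean())**2)
    return coeffs, r2

# Synthetic demo: a hypothetical baseline protocol and a made-up F1 surface.
TE0, TI0 = 120.0, 2400.0                      # ms, illustrative values only
te_grid, ti_grid = np.meshgrid(np.linspace(80, 160, 5),
                               np.linspace(2000, 2800, 5))
te_flat, ti_flat = te_grid.ravel(), ti_grid.ravel()
true_coeffs = np.array([0.70, -0.10, 0.02, -0.50, -0.10, 0.0])
f1_synth = design_matrix(te_flat, ti_flat, TE0, TI0) @ true_coeffs

coeffs, r2 = fit_f1_model(te_flat, ti_flat, f1_synth, TE0, TI0)
print("fitted coefficients:", np.round(coeffs, 3))
print("R^2:", r2)
```

Expressing TE and TI as relative offsets keeps the fit well conditioned and puts both parameters on a comparable scale, which is one way the magnitudes of the TE and TI coefficients could be compared directly.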

Purpose: To provide a simulation framework for routine neuroimaging test data that allows "stress testing" of deep segmentation networks against acquisition shifts that commonly occur in clinical practice for T2-weighted (T2w) fluid-attenuated inversion recovery (FLAIR) Magnetic Resonance Imaging (MRI) protocols.

Approach: The approach simulates "acquisition shift derivatives" of MR images based on MR signal equations. Experiments comprise the validation of the simulated images against real MR scans and example stress tests on state-of-the-art MS lesion segmentation networks, which explore a generic model function describing the F1 score as a function of the contrast-affecting sequence parameters echo time (TE) and inversion time (TI).

Results: The differences between real and simulated images range up to 19% in gray and white matter for extreme parameter settings. For the segmentation networks under test, the dependency of the F1 score on TE and TI can be well described by quadratic model functions (R^2 > 0.9). The coefficients of the model functions indicate that changes of TE have more influence on model performance than changes of TI.

Conclusions: We show that the deviations between real and simulated images are within the range that errors in, or individual differences of, relaxation times can cause according to the literature. The coefficients of the F1 model function allow for a quantitative comparison of the influences of TE and TI. Limitations arise mainly from tissues with low baseline signal (such as cerebrospinal fluid, CSF) and from protocols containing contrast-affecting measures that cannot be modelled because the information is missing from the DICOM header.
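As an illustration of simulating an acquisition shift from MR signal equations, the sketch below uses the textbook inversion-recovery spin-echo magnitude signal, S = PD · |1 − 2·exp(−TI/T1) + exp(−TR/T1)| · exp(−TE/T2). The abstract does not state which equation the authors use, so this choice, the ratio-based rescaling, the function names, and the round tissue values are all assumptions for illustration.

```python
import numpy as np

def flair_signal(pd, t1, t2, tr, te, ti):
    """Magnitude signal of an inversion-recovery spin-echo sequence (times in ms)."""
    longitudinal = 1.0 - 2.0 * np.exp(-ti / t1) + np.exp(-tr / t1)
    return pd * np.abs(longitudinal) * np.exp(-te / t2)

def shift_ratio(t1, t2, tr, te0, ti0, te1, ti1):
    """Factor by which a tissue's signal changes when (TE, TI) move
    from the acquired (te0, ti0) to the simulated (te1, ti1)."""
    base = flair_signal(1.0, t1, t2, tr, te0, ti0)
    shifted = flair_signal(1.0, t1, t2, tr, te1, ti1)
    return shifted / base  # proton density cancels in the ratio

# Illustrative round relaxation times in ms (assumed, not taken from the paper).
tissues = {"white matter": (800.0, 80.0), "gray matter": (1300.0, 100.0)}
TR, TE0, TI0 = 9000.0, 120.0, 2500.0   # hypothetical baseline protocol
TE1, TI1 = 160.0, 2500.0               # simulated shift: longer echo time

for name, (t1, t2) in tissues.items():
    ratio = shift_ratio(t1, t2, TR, TE0, TI0, TE1, TI1)
    print(f"{name}: signal scaled by {ratio:.3f}")
```

Multiplying the voxels of a tissue class by such a ratio would give one simple form of a simulated image at the new parameters. The ratio becomes ill-conditioned when the baseline signal approaches zero, which is consistent with the limitation the abstract notes for low-signal tissues such as CSF.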
