Beatrice Demiray

IVMar 10, 2021

Are we using appropriate segmentation metrics? Identifying correlates of human expert perception for CNN training beyond rolling the DICE coefficient

Florian Kofler, Ivan Ezhov, Fabian Isensee et al.

Metrics optimized in complex machine learning tasks are often selected in an ad-hoc manner. It is unknown how they align with human expert perception. We explore the correlations between established quantitative segmentation quality metrics and qualitative evaluations by professionally trained human raters. Therefore, we conduct psychophysical experiments for two complex biomedical semantic segmentation problems. We discover that current standard metrics and loss functions correlate only moderately with the segmentation quality assessment of experts. Importantly, this effect is particularly pronounced for clinically relevant structures, such as the enhancing tumor compartment of glioma in brain magnetic resonance and grey matter in ultrasound imaging. It is often unclear how to optimize abstract metrics, such as human expert perception, in convolutional neural network (CNN) training. To cope with this challenge, we propose a novel strategy employing techniques of classical statistics to create complementary compound loss functions to better approximate human expert perception. Across all rating experiments, human experts consistently scored computer-generated segmentations better than the human-curated reference labels. Our results, therefore, strongly question many current practices in medical image segmentation and provide meaningful cues for future research.

CVApr 10, 2019

Weakly-Supervised White and Grey Matter Segmentation in 3D Brain Ultrasound

Beatrice Demiray, Julia Rackerseder, Stevica Bozhinoski et al.

Although the segmentation of brain structures in ultrasound helps initialize image based registration, assist brain shift compensation, and provides interventional decision support, the task of segmenting grey and white matter in cranial ultrasound is very challenging and has not been addressed yet. We train a multi-scale fully convolutional neural network simultaneously for two classes in order to segment real clinical 3D ultrasound data. Parallel pathways working at different levels of resolution account for high frequency speckle noise and global 3D image features. To ensure reproducibility, the publicly available RESECT dataset is utilized for training and cross-validation. Due to the absence of a ground truth, we train with weakly annotated label. We implement label transfer from MRI to US, which is prone to a residual but inevitable registration error. To further improve results, we perform transfer learning using synthetic US data. The resulting method leads to excellent Dice scores of 0.7080, 0.8402 and 0.9315 for grey matter, white matter and background. Our proposed methodology sets an unparalleled standard for white and grey matter segmentation in 3D intracranial ultrasound.

Beatrice Demiray

2 Papers