CV, AI, LG | Oct 10, 2025

A methodology for clinically driven interactive segmentation evaluation

Parhom Esmaeili, Virginia Fernandez, Pedro Borges, Eli Gibson, Sebastien Ourselin, M. Jorge Cardoso

arXiv:2510.09499v16.21 citations | h-index: 12 | HAIC@MICCAI

Originality: Incremental advance

AI Analysis

This work addresses evaluation inconsistencies that affect researchers and clinicians working on medical image segmentation. It is an incremental advance because it builds on existing interactive segmentation methods rather than introducing a new one.

The paper tackles inconsistent and clinically unrealistic evaluation in interactive medical image segmentation by proposing a clinically grounded methodology and a software framework for standardized evaluation. The authors find that minimizing information loss when processing user interactions and using adaptive-zooming mechanisms both boost robustness, and that performance drops by up to 15% when validation prompting behavior or budgets differ from those used in training.

Interactive segmentation is a promising strategy for building robust, generalisable algorithms for volumetric medical image segmentation. However, inconsistent and clinically unrealistic evaluation hinders fair comparison and misrepresents real-world performance. We propose a clinically grounded methodology for defining evaluation tasks and metrics, and build a software framework for constructing standardised evaluation pipelines. We evaluate state-of-the-art algorithms across heterogeneous and complex tasks and observe that (i) minimising information loss when processing user interactions is critical for model robustness, (ii) adaptive-zooming mechanisms boost robustness and speed convergence, (iii) performance drops if validation prompting behaviour/budgets differ from training, (iv) 2D methods perform well with slab-like images and coarse targets, but 3D context helps with large or irregularly shaped targets, and (v) performance of non-medical-domain models (e.g. SAM2) degrades with poor contrast and complex shapes.
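
To make the idea of a standardised interactive evaluation pipeline concrete, here is a minimal Python sketch of one evaluation loop. It is not the paper's framework. Every name in it (`evaluate_interactive`, `simulate_click`, `toy_model`) is hypothetical, and drawing corrective clicks uniformly from the current error region is an assumed prompting behaviour, not one stated in the abstract. The sketch only shows how the prompt budget and the click-simulation rule can be made explicit parameters of the evaluation, which is what observation (iii) suggests should match between training and validation.

```python
import numpy as np

def dice(pred, target):
    """Dice overlap between two boolean masks; defined as 1.0 when both are empty."""
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

def simulate_click(pred, target, rng):
    """Assumed prompting behaviour: sample one voxel uniformly from the error region.

    Returns (coordinates, is_positive), or None when the prediction is already exact.
    """
    errors = np.argwhere(np.logical_xor(pred, target))
    if len(errors) == 0:
        return None
    point = tuple(errors[rng.integers(len(errors))])
    return point, bool(target[point])  # positive click if the voxel belongs to the target

def evaluate_interactive(model, image, target, budget, seed=0):
    """Run up to `budget` simulated interactions and record Dice after each one."""
    rng = np.random.default_rng(seed)
    pred = np.zeros(target.shape, dtype=bool)
    clicks, scores = [], []
    for _ in range(budget):
        click = simulate_click(pred, target, rng)
        if click is None:
            break
        clicks.append(click)
        pred = model(image, clicks).astype(bool)
        scores.append(dice(pred, target))
    return scores

def toy_model(image, clicks, radius=3):
    """Hypothetical stand-in model: paint a ball around positive clicks, erase around negative ones."""
    grid = np.indices(image.shape)
    mask = np.zeros(image.shape, dtype=bool)
    for point, is_positive in clicks:
        dist2 = sum((g - p) ** 2 for g, p in zip(grid, point))
        mask[dist2 <= radius ** 2] = is_positive
    return mask

# Synthetic 3D example: a cubic target inside a 16^3 volume.
target = np.zeros((16, 16, 16), dtype=bool)
target[4:12, 4:12, 4:12] = True
image = target.astype(float)
scores = evaluate_interactive(toy_model, image, target, budget=5)
```

Here `scores` holds one Dice value per interaction, so a convergence curve can be read off directly. Running the same loop with a different `budget` or a different `simulate_click` rule is one simple way to probe how sensitive a model is to a mismatch in prompting behaviour.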
