CV · LG · Mar 31

Better than Average: Spatially-Aware Aggregation of Segmentation Uncertainty Improves Downstream Performance

arXiv:2603.29941 · 36.3
AI Analysis

The paper addresses a gap in uncertainty quantification for safety-critical domains such as biomedical imaging and autonomous driving. The contribution is incremental, however, because it builds on existing aggregation methods.

The paper studies how to aggregate pixel-wise uncertainty scores from image segmentation into image-level scores for downstream tasks such as Out-of-Distribution detection. It finds that spatially-aware aggregation strategies improve performance, and that a meta-aggregator achieves robust results across ten datasets.

Uncertainty Quantification (UQ) is crucial for ensuring the reliability of automated image segmentations in safety-critical domains like biomedical image analysis or autonomous driving. In segmentation, UQ generates pixel-wise uncertainty scores that must be aggregated into image-level scores for downstream tasks like Out-of-Distribution (OoD) or failure detection. Despite routine use of aggregation strategies, their properties and impact on downstream task performance have not yet been comprehensively studied. Global Average is the default choice, yet it does not account for spatial and structural features of segmentation uncertainty. Alternatives like patch-, class- and threshold-based strategies exist, but lack systematic comparison, leading to inconsistent reporting and unclear best practices. We address this gap by (1) formally analyzing properties, limitations, and pitfalls of common strategies; (2) proposing novel strategies that incorporate spatial uncertainty structure and (3) benchmarking their performance on OoD and failure detection across ten datasets that vary in image geometry and structure. We find that aggregators leveraging spatial structure yield stronger performance in both downstream tasks studied. However, the performance of individual aggregators depends heavily on dataset characteristics, so we (4) propose a meta-aggregator that integrates multiple aggregators and performs robustly across datasets.
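To make the aggregation problem concrete, the following is a minimal NumPy sketch of the strategy families the abstract names: global average, patch-based, threshold-based and class-based. The function names, the patch size, the threshold `tau` and the toy uncertainty maps are all illustrative assumptions; these are generic textbook-style variants and not necessarily the exact formulations analyzed or proposed in the paper.

```python
import numpy as np

def global_average(u):
    """Mean over all pixel uncertainties (the default choice)."""
    return float(u.mean())

def patch_max(u, patch=4):
    """Largest mean uncertainty over non-overlapping patch x patch tiles.

    Assumed variant: border pixels that do not fill a whole tile are cropped.
    """
    h, w = u.shape
    hc, wc = h - h % patch, w - w % patch
    tiles = u[:hc, :wc].reshape(hc // patch, patch, wc // patch, patch)
    return float(tiles.mean(axis=(1, 3)).max())

def threshold_mean(u, tau=0.5):
    """Mean uncertainty over pixels above tau; 0.0 if no pixel exceeds it."""
    mask = u > tau
    return float(u[mask].mean()) if mask.any() else 0.0

def class_average(u, pred):
    """Average of per-class mean uncertainties, given a predicted label map."""
    return float(np.mean([u[pred == c].mean() for c in np.unique(pred)]))

# Two toy 8x8 uncertainty maps with the same total uncertainty (hypothetical data).
clustered = np.zeros((8, 8))
clustered[:4, :4] = 0.8            # 16 highly uncertain pixels in one corner
diffuse = np.full((8, 8), 0.2)     # low uncertainty spread over every pixel

# Hypothetical predicted segmentation: one object in the same corner.
pred = np.zeros((8, 8), dtype=int)
pred[:4, :4] = 1

for name, u in [("clustered", clustered), ("diffuse", diffuse)]:
    print(name,
          round(global_average(u), 3),
          round(patch_max(u), 3),
          round(threshold_mean(u), 3),
          round(class_average(u, pred), 3))
```

In this toy setup the global average assigns both maps the same image-level score, while the patch-, threshold- and class-based variants separate the spatially concentrated map from the diffuse one. This is the kind of spatial structure the abstract says Global Average does not account for.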

