Segmentation Similarity and Agreement
This work addresses the need for better evaluation metrics in segmentation tasks, particularly for comparing human and automatic segmenters. The contribution appears incremental, since it builds on existing edit-distance and inter-annotator agreement concepts.
To evaluate segmentation quality, the authors propose a new metric, segmentation similarity (S), which measures boundary alignment using edit distance, and adapt inter-annotator agreement coefficients to segmentation tasks. They report that S improves upon state-of-the-art methods.
We propose a new segmentation evaluation metric, called segmentation similarity (S), that quantifies the similarity between two segmentations as the proportion of boundaries that are not transformed when comparing them using edit distance. In effect, S uses edit distance as a penalty function and scales penalties by segmentation size. We also propose several inter-annotator agreement coefficients, adapted to use S, that are suitable for segmentation. We show that S is configurable enough to suit a wide variety of segmentation evaluations and that it improves upon the state of the art. Finally, we propose using inter-annotator agreement coefficients to evaluate automatic segmenters in terms of human performance.
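
To make the idea of "the proportion of boundaries that are not transformed" concrete, the following is a minimal Python sketch, not the reference implementation. It rests on several assumptions that the abstract does not state: a single boundary type; segmentations given as lists of segment sizes; each missing or extra boundary counted as one edit; a boundary displaced by at most max_shift units counted as one edit rather than two; and penalties scaled by the number of potential boundary positions (total units minus one). The names boundary_positions, segmentation_similarity, and max_shift are hypothetical.

```python
def boundary_positions(masses):
    """Convert segment sizes, e.g. [2, 3, 6], into internal boundary positions {2, 5}."""
    positions = set()
    offset = 0
    for mass in masses[:-1]:  # the end of the last segment is not a boundary
        offset += mass
        positions.add(offset)
    return positions


def segmentation_similarity(masses_a, masses_b, max_shift=1):
    """Simplified S: 1 - (boundary edits / potential boundaries).

    Assumption: one boundary type, unit cost for every edit, and a greedy
    pairing of near-miss boundaries (a simplification of a true edit distance).
    """
    total = sum(masses_a)
    if total != sum(masses_b):
        raise ValueError("segmentations must cover the same number of units")

    potential = total - 1  # assumed scaling factor: possible boundary positions
    if potential <= 0:
        return 1.0

    a = boundary_positions(masses_a)
    b = boundary_positions(masses_b)
    unmatched_a = sorted(a - b)
    unmatched_b = b - a

    edits = 0
    for pos in unmatched_a:
        # Look for a near miss: a boundary in b displaced by at most max_shift.
        near = None
        for shift in range(1, max_shift + 1):
            for candidate in (pos - shift, pos + shift):
                if candidate in unmatched_b:
                    near = candidate
                    break
            if near is not None:
                break
        if near is not None:
            unmatched_b.discard(near)  # pair the two boundaries as one edit
        edits += 1  # one edit either way: a shift, or an add/delete

    edits += len(unmatched_b)  # boundaries present only in b
    return 1 - edits / potential
```

Under these assumptions, segmentation_similarity([2, 9], [3, 8]) treats the displaced boundary as a single edit, while passing max_shift=0 counts it as two separate edits and yields a lower score. This is one way the metric's configurability can affect how near misses are penalised.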