CV · Jul 16, 2017

Expected exponential loss for gaze-based video and volume ground truth annotation

arXiv:1707.04905v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work reduces the annotation burden for medical imaging researchers and practitioners by cutting the time and expense of collecting ground truth data, though it is an incremental improvement over existing gaze-based methods.

The paper tackles the high cost of pixel-wise annotation for object segmentation in medical videos and 3D volumes. It proposes a framework in which annotators simply look at the object while an eye gaze tracker records where they looked. From these gaze locations, the method estimates pixel-wise object probabilities and trains a classifier with a novel Expected Exponential loss function. The results show superior performance across a range of medical image settings compared to existing strategies, and the method is compatible with crowd-sourcing paradigms.

Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework that allows annotators to simply observe the object to segment and record where they have looked with a $200 eye gaze tracker. Our method then estimates pixel-wise probabilities for the presence of the object throughout the sequence, from which we train a classifier in a semi-supervised setting using a novel Expected Exponential loss function. We show that our framework provides superior performance on a wide range of medical image settings compared to existing strategies and that our method can be combined with current crowd-sourcing paradigms as well.
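The abstract does not spell out the loss, so the following is only a minimal sketch of one natural reading of the name: the standard exponential loss exp(-y f(x)), averaged over an uncertain label y in {-1, +1} whose probability of being +1 is the estimated per-pixel object probability p. The function names `expected_exponential_loss` and `optimal_score` are hypothetical and chosen for illustration; the paper's exact formulation and training procedure may differ.

```python
import math


def expected_exponential_loss(scores, probs):
    """Mean over pixels of E_y[exp(-y * f)], with y in {-1, +1}.

    scores: classifier outputs f(x), one real number per pixel.
    probs:  assumed soft labels P(y = +1), one per pixel, in [0, 1].
    This form is an assumption based on the loss's name.
    """
    if len(scores) != len(probs):
        raise ValueError("scores and probs must have the same length")
    total = 0.0
    for f, p in zip(scores, probs):
        # Weight the loss for each possible label by its probability.
        total += p * math.exp(-f) + (1.0 - p) * math.exp(f)
    return total / len(scores)


def optimal_score(p, eps=1e-12):
    """Score minimizing the per-pixel expected loss: 0.5 * log(p / (1 - p))."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return 0.5 * math.log(p / (1.0 - p))
```

Under this reading, a pixel with a hard label (p equal to 0 or 1) recovers the ordinary exponential loss, while an uncertain pixel (p near 0.5) pulls the classifier score toward zero, so confidently labeled pixels dominate training.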

