LG HC | Feb 6, 2023

When the Ground Truth is not True: Modelling Human Biases in Temporal Annotations

Taku Yamagata, Emma L. Tonkin, Benjamin Arana Sanchez, Ian Craddock, Miquel Perello Nieto, Raul Santos-Rodriguez, Weisong Yang, Peter Flach

arXiv:2302.02706v1 | 3.81 citations | h-index: 61

Originality: Incremental advance

AI Analysis

The paper addresses unreliable evaluation and poor model performance caused by low-quality temporal annotations, particularly in domains such as daily activity tracking. The advance is incremental because it builds on existing soft label methods.

The paper tackles the problem of low-quality temporal annotations in supervised learning by modelling human biases and arguing for soft labels. On synthetic data, soft labels approximate the ground truth better than the alternatives for several metrics. The method is also showcased on a real dataset of daily activities.

In supervised learning, low quality annotations lead to poorly performing classification and detection models, while also rendering evaluation unreliable. This is particularly apparent on temporal data, where annotation quality is affected by multiple factors. For example, in the post-hoc self-reporting of daily activities, cognitive biases are one of the most common ingredients. In particular, reporting the start and duration of an activity after its finalisation may incorporate biases introduced by personal time perceptions, as well as the imprecision and lack of granularity due to time rounding. Here we propose a method to model human biases on temporal annotations and argue for the use of soft labels. Experimental results in synthetic data show that soft labels provide a better approximation of the ground truth for several metrics. We showcase the method on a real dataset of daily activities.
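As a rough illustration of the soft label idea, the sketch below turns one self-reported activity interval into per-timestep probabilities instead of 0/1 labels. It is not the authors' model: it assumes, purely for illustration, that the true start and end each deviate from the reported values by independent zero-mean Gaussian noise with a hypothetical standard deviation `sigma`. The function names `soft_labels` and `hard_labels` are likewise invented for this example.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def soft_labels(reported_start, reported_end, sigma, times):
    """Probability that the activity is ongoing at each time in `times`.

    Assumption (not from the paper): the true start and true end are the
    reported values plus independent Gaussian noise with std `sigma`.
    """
    labels = []
    for t in times:
        # P(true start <= t) under the assumed noise model
        p_started = normal_cdf((t - reported_start) / sigma)
        # P(true end > t) under the assumed noise model
        p_not_ended = 1.0 - normal_cdf((t - reported_end) / sigma)
        labels.append(p_started * p_not_ended)
    return labels


def hard_labels(reported_start, reported_end, times):
    """Conventional 0/1 labels that take the report at face value."""
    return [1.0 if reported_start <= t < reported_end else 0.0 for t in times]


# Example: an activity reported as running from minute 15 to minute 45,
# sampled every 5 minutes, with an assumed 5-minute boundary uncertainty.
times = list(range(0, 61, 5))
soft = soft_labels(15, 45, 5.0, times)
hard = hard_labels(15, 45, times)

for t, h, s in zip(times, hard, soft):
    print(f"t={t:2d}  hard={h:.0f}  soft={s:.3f}")
```

Under these assumptions the soft label is close to 1 in the middle of the reported interval and falls smoothly towards 0 around the reported boundaries, which is where a rounded or misremembered time is most likely to disagree with what actually happened.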
