CVLGJul 29, 2021

Semi-Supervised Active Learning with Temporal Output Discrepancy

arXiv:2107.14153v188 citations
AI Analysis

This work addresses the problem of reducing annotation costs for machine learning practitioners, offering an incremental improvement in active learning efficiency and flexibility.

The paper tackles the high cost of data annotation in deep learning by proposing a novel active learning approach that selects informative unlabeled samples based on Temporal Output Discrepancy (TOD), which estimates sample loss by evaluating output discrepancies across optimization steps, achieving superior performance over state-of-the-art methods on image classification and semantic segmentation tasks.

While deep learning succeeds in a wide range of tasks, it highly depends on the massive collection of annotated data which is expensive and time-consuming. To lower the cost of data annotation, active learning has been proposed to interactively query an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that the samples with higher loss are usually more informative to the model than the samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss. The core of our approach is a measurement Temporal Output Discrepancy (TOD) that estimates the sample loss by evaluating the discrepancy of outputs given by models at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss thus it can be used to select informative unlabeled samples. On basis of TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion that enhances model performance by incorporating the unlabeled data. Due to the simplicity of TOD, our active learning approach is efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach achieves superior performances than the state-of-the-art active learning methods on image classification and semantic segmentation tasks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes