CV · Jan 1, 2020

Low-Budget Label Query through Domain Alignment Enforcement

arXiv:2001.00238v2 · 5 citations
AI Analysis

This paper addresses the costly, expertise-intensive process of producing labelled data for specific requirements, offering an incremental improvement in sample-selection efficiency.

The paper tackles low-budget label query: selecting a small set of samples from a completely unlabelled dataset to be labelled, so as to maximize classification accuracy on that dataset. It reports state-of-the-art results on a few Unsupervised Domain Adaptation tasks and a selection method that outperforms baselines and competing models across a variety of public datasets.

The deep learning revolution happened thanks to the availability of massive amounts of labelled data, which have contributed to the development of models with extraordinary inference capabilities. Despite the public availability of a large number of datasets, addressing specific requirements often makes it necessary to generate a new set of labelled data. Quite often, producing labels is costly, and sometimes it requires specific know-how. In this work, we tackle a new problem named low-budget label query, which consists of suggesting to the user a small (low-budget) set of samples to be labelled, from a completely unlabelled dataset, with the final goal of maximizing the classification accuracy on that dataset. We first improve an Unsupervised Domain Adaptation (UDA) method to better align source and target domains using consistency constraints, reaching the state of the art on a few UDA tasks. Then, using the previously trained model as a reference, we propose a simple yet effective selection method based on uniform sampling of the prediction consistency distribution, which is deterministic and steadily outperforms other baselines as well as competing models on a large variety of publicly available datasets.
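The abstract gives no implementation details for the selection step, so the following is only a minimal sketch of one possible reading of "uniform sampling of the prediction consistency distribution". It assumes that consistency is measured as the fraction of augmented views of a sample on which the model predicts the same class, and that "uniform sampling" means picking evenly spaced ranks after sorting samples by that score. The function names `consistency_scores` and `select_for_labelling` are hypothetical and do not come from the paper.

```python
import numpy as np


def consistency_scores(predictions: np.ndarray) -> np.ndarray:
    """Assumed consistency measure (not taken from the paper).

    predictions: integer array of shape (n_samples, n_views) holding the
    class predicted by the reference model for each augmented view.
    Returns, per sample, the fraction of views that agree with the most
    frequent predicted class.
    """
    scores = np.empty(predictions.shape[0], dtype=float)
    for i, row in enumerate(predictions):
        counts = np.bincount(row)          # votes per class for this sample
        scores[i] = counts.max() / row.size
    return scores


def select_for_labelling(scores: np.ndarray, budget: int) -> np.ndarray:
    """Pick `budget` sample indices at evenly spaced ranks of the sorted scores.

    No randomness is involved, so the same scores always give the same
    selection. Returned indices go from least to most consistent.
    """
    n = scores.shape[0]
    if not 1 <= budget <= n:
        raise ValueError("budget must be between 1 and the number of samples")
    order = np.argsort(scores, kind="stable")   # least to most consistent
    # Integer arithmetic keeps the ranks exact and distinct for budget <= n.
    ranks = (np.arange(budget) * (n - 1)) // max(budget - 1, 1)
    return order[ranks]


if __name__ == "__main__":
    # Toy predictions for 3 samples and 4 augmented views each.
    preds = np.array([[0, 0, 0, 0], [0, 1, 0, 1], [2, 2, 2, 1]])
    print(consistency_scores(preds))

    # Toy scores for 5 samples with a labelling budget of 3.
    toy_scores = np.array([0.9, 0.1, 0.5, 0.3, 0.7])
    print(select_for_labelling(toy_scores, budget=3))
```

Under these assumptions the selection covers the whole range of consistency values, from samples the model is unsure about to samples it predicts stably, instead of concentrating the budget at one end of the distribution.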

