LG · ML · Feb 28, 2023

Active Learning with Combinatorial Coverage

arXiv:2302.14567v1 · 7 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses practical deployment issues that machine learning practitioners face with active learning, though it appears incremental because it builds on existing active learning frameworks.

The paper tackles the problems of model reliance and sampling bias in active learning by proposing data-centric methods using combinatorial coverage, showing that these methods produce data that transfers better to new models and has competitive sampling bias compared to benchmarks.

Active learning is a practical field of machine learning that automates the process of selecting which data to label. Current methods are effective in reducing the burden of data labeling but are heavily model-reliant. As a result, the sampled data often cannot be transferred to new models, and sampling bias becomes an issue. Both problems are of crucial concern in machine learning deployment. We propose active learning methods utilizing combinatorial coverage to overcome these issues. The proposed methods are data-centric, as opposed to model-centric, and our experiments show that including coverage in active learning yields sampled data that tends to transfer best to better-performing models and has competitive sampling bias compared to benchmark methods.
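To make the data-centric idea concrete, here is a minimal sketch of a coverage-driven query step. It is not the paper's algorithm: it assumes discretized (categorical) features, takes the "universe" of t-way interactions to be those observed in the available data, and uses a plain greedy rule. The function names `interactions`, `coverage`, and `greedy_coverage_query` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def interactions(row, t=2):
    """Return the t-way interactions in one row as (feature indices, values) pairs."""
    idx = range(len(row))
    return {(cols, tuple(row[c] for c in cols)) for cols in combinations(idx, t)}


def coverage(rows, universe, t=2):
    """Fraction of the interactions in `universe` that appear in `rows`."""
    if not universe:
        return 0.0
    seen = set()
    for r in rows:
        seen |= interactions(r, t)
    return len(seen & universe) / len(universe)


def greedy_coverage_query(labeled, pool, batch_size, t=2):
    """Pick pool indices that add the most not-yet-covered t-way interactions.

    No model is consulted: the choice depends only on the data, which is
    the sense in which this sketch is data-centric. Ties go to the lowest index.
    """
    covered = set()
    for r in labeled:
        covered |= interactions(r, t)
    chosen = []
    remaining = set(range(len(pool)))
    for _ in range(min(batch_size, len(pool))):
        best = max(sorted(remaining),
                   key=lambda i: len(interactions(pool[i], t) - covered))
        chosen.append(best)
        covered |= interactions(pool[best], t)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # Toy data: three binary features per point (an assumption for illustration).
    labeled = [(0, 0, 0)]
    pool = [(0, 0, 0), (1, 1, 1), (0, 0, 1), (1, 0, 1)]
    universe = set().union(*(interactions(r) for r in labeled + pool))
    picked = greedy_coverage_query(labeled, pool, batch_size=2)
    print(picked, coverage(labeled + [pool[i] for i in picked], universe))
```

Because the selection rule never looks at model predictions, the same queried set can be reused when the model changes. A practical method would likely combine a coverage term like this with a model-based score such as uncertainty, and how the two are weighted is a design choice this sketch leaves open.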

