ML · LG · May 29, 2019

The Label Complexity of Active Learning from Observational Data

arXiv:1905.12791v2 · 10 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reducing label complexity in counterfactual learning, a concern for machine learning practitioners. It is an incremental advance, building on existing active learning and counterfactual risk minimization techniques.

The paper tackles the problem of active learning from observational data by proposing a new algorithm that integrates a counterfactual risk minimizer, resulting in a statistically consistent method with improved label efficiency compared to prior work.

Counterfactual learning from observational data involves learning a classifier on an entire population based on data that is observed conditioned on a selection policy. This work considers this problem in an active setting, where the learner additionally has access to unlabeled examples and can choose to get a subset of these labeled by an oracle. Prior work on this problem uses disagreement-based active learning, along with an importance weighted loss estimator to account for counterfactuals, which leads to a high label complexity. We show how to instead incorporate a more efficient counterfactual risk minimizer into the active learning algorithm. This requires us to modify both the counterfactual risk to make it amenable to active learning, as well as the active learning process to make it amenable to the risk. We provably demonstrate that the result of this is an algorithm which is statistically consistent as well as more label-efficient than prior work.
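To make the "importance weighted loss estimator" mentioned above concrete, here is a minimal Python sketch of the standard inverse-propensity form of such an estimator. It is an illustration only, not the algorithm proposed in the paper or the exact estimator used in prior work. The function name `importance_weighted_error`, its parameters, and the assumption that the selection probability of each example is known are all hypothetical choices made for this example.

```python
from typing import Callable, Sequence


def importance_weighted_error(
    xs: Sequence[float],
    ys: Sequence[int],
    observed: Sequence[bool],
    propensities: Sequence[float],
    classifier: Callable[[float], int],
) -> float:
    """Estimate the population 0-1 error from selectively labeled data.

    Assumption (hypothetical setup): the label of example i was revealed
    with known probability propensities[i] > 0, and observed[i] records
    whether it actually was revealed. Labels of unobserved examples are
    never read.
    """
    n = len(xs)
    total = 0.0
    for x, y, seen, q in zip(xs, ys, observed, propensities):
        if seen:
            # Up-weight each observed mistake by 1/q so that it also
            # stands in for similar examples whose labels were hidden.
            total += (1.0 / q) * float(classifier(x) != y)
    return total / n
```

Because each observed loss is scaled by the inverse of its selection probability, examples that are rarely selected can receive very large weights, which makes the estimate high-variance. This is one intuition for why an estimator of this form can require many labels.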

Code Implementations: 1 repo
