ML · LG · Aug 14, 2020

A New Perspective on Pool-Based Active Classification and False-Discovery Control

arXiv:2008.06555v1 · 14 citations
Originality: Highly original
AI Analysis

This paper addresses a gap in active learning for scientific settings where minimizing false discoveries is critical, and it offers a novel approach with theoretical guarantees.

The paper tackles the problem of adaptive experimental design for identifying regions with high true positive rate and low false discovery rate, providing the first provably sample-efficient algorithm for this task.

In many scientific settings there is a need for adaptive experimental design to guide the process of identifying regions of the search space that contain as many true positives as possible, subject to a low rate of false discoveries (i.e., false alarms). Such regions of the search space could differ drastically from a predicted set that minimizes 0/1 error, and accurate identification could require very different sampling strategies. As in active learning for binary classification, this experimental design cannot be chosen optimally a priori; rather, the data must be collected sequentially and adaptively. However, unlike classification with 0/1 error, collecting data adaptively to find a set with high true positive rate (TPR) and low false discovery rate (FDR) is not as well understood. In this paper we provide the first provably sample-efficient adaptive algorithm for this problem. Along the way we highlight connections between classification, combinatorial bandits, and FDR control, making contributions to each.
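To make the contrast between the two objectives concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a finite pool in which each item's probability of being a true positive is already known (the hypothetical list `eta`), so no adaptive sampling is involved. The function names `expected_rates`, `zero_one_set`, and `fdr_constrained_set`, the example probabilities, and the target level `alpha` are all illustrative assumptions. FDR is taken here as the expected fraction of selected items that are negatives, and TPR as the ratio of expected true positives selected to expected positives in the pool.

```python
def expected_rates(eta, selected):
    """Return (TPR, FDR) of a selected set under known positive probabilities.

    Assumption: eta[i] is the known probability that item i is a true positive.
    """
    chosen = [p for p, s in zip(eta, selected) if s]
    if not chosen:
        return 0.0, 0.0
    # Expected fraction of selected items that are false discoveries.
    fdr = sum(1.0 - p for p in chosen) / len(chosen)
    # Expected positives captured, relative to expected positives in the pool.
    total = sum(eta)
    tpr = sum(chosen) / total if total > 0 else 0.0
    return tpr, fdr


def zero_one_set(eta):
    """Set minimizing expected 0/1 error: keep items more likely positive than not."""
    return [p > 0.5 for p in eta]


def fdr_constrained_set(eta, alpha):
    """Largest set of most-probable items whose expected FDR stays at or below alpha."""
    order = sorted(range(len(eta)), key=lambda i: eta[i], reverse=True)
    selected = [False] * len(eta)
    false_mass = 0.0
    for k, i in enumerate(order, start=1):
        false_mass += 1.0 - eta[i]
        # Items are added in decreasing probability, so the running FDR never
        # decreases; once it exceeds alpha, no larger prefix is feasible.
        if false_mass / k > alpha:
            break
        selected[i] = True
    return selected


# Hypothetical pool of five items and a hypothetical FDR target of 10%.
eta = [0.95, 0.9, 0.6, 0.55, 0.3]
alpha = 0.1

by_error = zero_one_set(eta)
by_fdr = fdr_constrained_set(eta, alpha)
print("0/1-error set:", by_error, expected_rates(eta, by_error))
print("FDR-constrained set:", by_fdr, expected_rates(eta, by_fdr))
```

In this toy pool the 0/1-error rule keeps four items while the FDR-constrained rule keeps only the two most certain ones, which illustrates why the two target sets, and hence the sampling needed to identify them, can differ.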
