AI · LG · ML · Dec 3, 2013

Test Set Selection using Active Information Acquisition for Predictive Models

arXiv:1312.0790v2
Originality: Synthesis-oriented
AI Analysis

This work addresses test set selection for targeted populations in applications such as banking and medical diagnosis, but it appears incremental, since it builds on existing active learning concepts.

The paper addresses labeling a targeted test set for a predictive model by actively acquiring information from the training set under a query budget, motivated by applications in banking and medical diagnosis. In experiments on synthetic data, the two proposed greedy algorithms outperform the baseline approaches.

In this paper, we consider active information acquisition when the prediction model is meant to be applied to a targeted subset of the population. The goal is to label a pre-specified fraction of customers in the target (test) set by iteratively querying for information from the non-target (training) set. The number of queries is limited by an overall budget. The problem arises in two rather disparate applications, banking and medical diagnosis, and we pose it as a constrained optimization problem. We propose two greedy iterative algorithms for solving this problem. We conduct experiments with synthetic data and compare the results of our proposed algorithms with a few baseline approaches. The experimental results show that our proposed approaches perform better than the baseline schemes.
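The abstract does not spell out the two greedy algorithms, so the following is only a minimal sketch of what a budget-limited greedy acquisition loop could look like, not the authors' method. It rests on assumptions that are not in the paper: points are 2-D tuples, each query costs one unit of budget, and a target point counts as "labeled" once some queried training point lies within a hypothetical `radius` of it. The names `greedy_acquire`, `radius`, `budget`, and `fraction` are illustrative.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float]


def greedy_acquire(
    train: Sequence[Point],
    target: Sequence[Point],
    radius: float,
    budget: int,
    fraction: float,
) -> Tuple[List[int], Set[int]]:
    """Greedily choose training indices to query (illustrative only).

    Assumption: querying train[i] lets us label every target point
    within `radius` of it. Stops when `fraction` of the target set is
    labeled, the budget is spent, or no query adds new labels.
    """
    # Precompute which target indices each training point would label.
    coverage = [
        {
            j
            for j, (x, y) in enumerate(target)
            if (x - tx) ** 2 + (y - ty) ** 2 <= radius ** 2
        }
        for (tx, ty) in train
    ]
    needed = math.ceil(fraction * len(target))
    queried: List[int] = []
    labeled: Set[int] = set()

    while len(queried) < budget and len(labeled) < needed:
        best, best_gain = None, 0
        for i in range(len(train)):
            if i in queried:
                continue
            # Marginal gain: newly labeled target points from this query.
            gain = len(coverage[i] - labeled)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining query helps
            break
        queried.append(best)
        labeled |= coverage[best]

    return queried, labeled


train = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
target = [(0.0, 1.0), (1.0, 0.0), (5.0, 6.0), (20.0, 20.0)]
queried, labeled = greedy_acquire(train, target, radius=1.5, budget=2, fraction=0.75)
```

In this toy run the loop first queries the training point that labels two target points, then the one that labels a third, which reaches the requested 75% within the budget of two queries. A real implementation would replace the distance rule with whatever labeling criterion the predictive model provides.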

