ML · Feb 1, 2018

Greedy Active Learning Algorithm for Logistic Regression Models

arXiv:1802.00243v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses two costs that practitioners face in binary classification: labeling data and carrying overly complex models. It is an incremental contribution that builds on existing active learning and variable selection methods.

The authors propose an active learning algorithm for logistic regression that performs batch subject selection and greedy variable selection simultaneously. It achieves competitive performance with smaller training sets and more compact models than a classifier trained on all variables and the full data set.

We study a logistic model-based active learning procedure for binary classification problems, in which we adopt a batch subject selection strategy with a modified sequential experimental design method. Moreover, accompanying the proposed subject selection scheme, we simultaneously conduct a greedy variable selection procedure such that we can update the classification model with all labeled training subjects. The proposed algorithm repeatedly performs both subject and variable selection steps until a prespecified stopping criterion is reached. Our numerical results show that the proposed procedure has competitive performance, with a smaller training size and a more compact model, compared with that of the classifier trained with all variables and a full data set. We also apply the proposed procedure to a well-known wave data set (Breiman et al., 1984) to confirm the performance of our method.
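The loop described in the abstract (select a batch of subjects, label them, greedily update the variable set, refit, repeat until a stopping rule fires) can be sketched roughly as follows. This is a minimal illustration and not the authors' algorithm: plain uncertainty sampling is swapped in for the paper's modified sequential experimental design criterion, the greedy step is assumed to be forward selection by log-likelihood gain, and the stopping rule is assumed to be a fixed label budget. All function names, parameters (`init`, `batch`, `budget`, `min_gain`), and the synthetic data are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, active, steps=200, lr=0.5, l2=1e-2):
    # Gradient ascent on a ridge-penalised log-likelihood, restricted to the
    # variables in `active`. Returns [intercept, coef_1, ..., coef_k].
    w = [0.0] * (len(active) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            feats = [1.0] + [xi[j] for j in active]
            err = yi - sigmoid(sum(a * b for a, b in zip(w, feats)))
            for k, f in enumerate(feats):
                grad[k] += err * f
        w = [wk + lr * (g / n - l2 * wk) for wk, g in zip(w, grad)]
    return w

def predict_proba(w, x, active):
    return sigmoid(w[0] + sum(wk * x[j] for wk, j in zip(w[1:], active)))

def log_lik(w, X, y, active):
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(w, xi, active)
        total += yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total

def greedy_add_variable(X, y, active, n_vars, min_gain=1.0):
    # Assumed greedy step: add the one variable with the largest
    # log-likelihood gain, provided the gain exceeds `min_gain`.
    base = log_lik(fit_logistic(X, y, active), X, y, active)
    best_j, best_gain = None, min_gain
    for j in range(n_vars):
        if j in active:
            continue
        cand = active + [j]
        gain = log_lik(fit_logistic(X, y, cand), X, y, cand) - base
        if gain > best_gain:
            best_j, best_gain = j, gain
    return active + [best_j] if best_j is not None else active

def active_learn(pool_X, oracle, n_vars, init=10, batch=5, budget=40, seed=0):
    rng = random.Random(seed)
    unlabeled = list(range(len(pool_X)))
    rng.shuffle(unlabeled)
    labeled = [unlabeled.pop() for _ in range(init)]
    labels = {i: oracle(i) for i in labeled}
    active = []
    while True:
        X = [pool_X[i] for i in labeled]
        y = [labels[i] for i in labeled]
        active = greedy_add_variable(X, y, active, n_vars)  # variable selection
        w = fit_logistic(X, y, active)  # refit on all labeled subjects
        if len(labeled) >= budget or not unlabeled:  # assumed stopping rule
            return w, active, labeled
        # Batch subject selection (stand-in criterion): query the subjects
        # whose predicted probability is closest to 0.5.
        unlabeled.sort(key=lambda i: abs(predict_proba(w, pool_X[i], active) - 0.5))
        picked, unlabeled = unlabeled[:batch], unlabeled[batch:]
        for i in picked:
            labels[i] = oracle(i)
        labeled.extend(picked)

# Synthetic pool: 5 variables, of which only the first two drive the label.
rng = random.Random(1)
pool = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
truth = [1 if rng.random() < sigmoid(3 * x[0] - 3 * x[1]) else 0 for x in pool]
w, active, labeled = active_learn(pool, lambda i: truth[i], n_vars=5)
print("selected variables:", active, "labels used:", len(labeled))
```

In this sketch the model is refit on every labeled subject after each batch, and the variable set can only grow by one variable per round. A design-based selection rule such as the one the paper describes would replace only the `unlabeled.sort(...)` line.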

