ML AI LG | Feb 25, 2018

Teacher Improves Learning by Selecting a Training Subset

Yuzhe Ma, Robert Nowak, Philippe Rigollet, Xuezhou Zhang, Xiaojin Zhu

arXiv:1802.08946v1 | 8.715 citations

Originality: Incremental advance

AI Analysis

This work addresses data efficiency in machine learning: it shows that a learner can do better with less data when a teacher chooses which examples to keep. The advance is incremental, since it builds on existing subset selection methods.

The paper studies how a teacher can improve learning by selecting a subset of the training data: trimming an iid training set can make the learner perform better than it would on the full set. It gives sharp guarantees for Gaussian mean estimation and for 1D large margin classifiers, and reports experiments in which the approach finds effective subsets for both regression and classification.

We call a learner super-teachable if a teacher can trim down an iid training set while making the learner learn even better. We provide sharp super-teaching guarantees on two learners: the maximum likelihood estimator for the mean of a Gaussian, and the large margin classifier in 1D. For general learners, we provide a mixed-integer nonlinear programming-based algorithm to find a super-teaching set. Empirical experiments show that our algorithm is able to find good super-teaching sets for both regression and classification problems.
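To make the Gaussian mean case concrete, here is a minimal sketch of the idea, assuming a teacher who knows the true mean and a learner that simply averages whatever points it is given (the maximum likelihood estimate). The greedy removal rule, the function name `greedy_trim`, and the parameter `min_size` are illustrative assumptions; this is not the paper's mixed-integer nonlinear programming algorithm, only a simple heuristic showing that a trimmed subset can yield a better estimate than the full iid sample.

```python
import random


def greedy_trim(sample, true_mean, min_size):
    """Hypothetical teacher: drop points one at a time while doing so
    moves the subset mean (the learner's MLE) closer to true_mean."""
    subset = list(sample)
    total = sum(subset)
    best_err = abs(total / len(subset) - true_mean)
    # Never shrink below min_size, and never below one point.
    while len(subset) > max(min_size, 1):
        n = len(subset)
        # Error of the learner's estimate after removing each candidate point.
        errs = [abs((total - x) / (n - 1) - true_mean) for x in subset]
        i = min(range(n), key=errs.__getitem__)
        if errs[i] >= best_err:
            break  # no single removal improves the estimate any further
        best_err = errs[i]
        total -= subset.pop(i)
    return subset


rng = random.Random(0)          # fixed seed so the run is repeatable
true_mean = 0.0                 # known to the teacher, not to the learner
sample = [rng.gauss(true_mean, 1.0) for _ in range(200)]
subset = greedy_trim(sample, true_mean, min_size=100)

full_err = abs(sum(sample) / len(sample) - true_mean)
sub_err = abs(sum(subset) / len(subset) - true_mean)
print(f"full sample: n={len(sample)}, error={full_err:.6f}")
print(f"trimmed set: n={len(subset)}, error={sub_err:.6f}")
```

Because the loop only removes a point when the removal strictly reduces the error, the trimmed set's estimate is never worse than the full-sample estimate. How much it improves depends on the sample and on `min_size`.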
