LG ML | Feb 13, 2019

Efficient Cross-Validation for Semi-Supervised Learning

Yong Liu, Jian Li, Guangjun Wu, Lizhong Ding, Weiping Wang

arXiv:1902.04768v1 | 1.81 citations

Originality: Incremental advance

AI Analysis

This work addresses efficiency issues in semi-supervised learning for practitioners, but it is incremental as it builds on existing manifold regularization techniques.

The paper tackles the high computational cost of cross-validation for hyperparameter selection in manifold regularization methods such as LapRLS and LapSVM. It proposes an approximate CV method based on the Bouligand influence function that requires training the learner only once; in the reported experiments, the approximation shows no statistical discrepancy with exact CV at a much lower time cost.

Manifold regularization, such as Laplacian regularized least squares (LapRLS) and the Laplacian support vector machine (LapSVM), has been widely used in semi-supervised learning, and its performance depends heavily on the choice of hyper-parameters. Cross-validation (CV) is the most popular approach for selecting the optimal hyper-parameters, but it has high complexity because the learner must be trained many times. In this paper, we provide a method to approximate the CV for manifold regularization based on a notion from robust statistics called the Bouligand influence function (BIF). We first provide a strategy for approximating the CV via the Taylor expansion of the BIF. Then, we show how to calculate the BIF for a general loss function, and further give the approximate CV criteria for model selection in manifold regularization. The proposed approximate CV for manifold regularization requires training only once, and hence can significantly improve on the efficiency of traditional CV. Experimental results show that our approximate CV has no statistical discrepancy with the original one, but a much smaller time cost.
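To make the cost described in the abstract concrete, the following is a minimal sketch of the naive baseline, not of the paper's BIF-based approximation: k-fold CV for LapRLS, where every fold and every hyper-parameter pair triggers a fresh linear solve. It assumes the standard closed-form LapRLS solution from the manifold regularization literature, an RBF kernel, and a dense RBF-weighted graph Laplacian. Held-out labeled points are treated as unlabeled during training, which is also an assumption. The names `rbf_kernel`, `graph_laplacian`, `fit_laprls`, and `kfold_cv_error` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def graph_laplacian(X, sigma=1.0):
    # Unnormalized Laplacian L = D - W of a dense RBF-weighted graph (assumed graph choice).
    W = rbf_kernel(X, X, sigma)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def fit_laprls(K, L, y_lab, lab_idx, gamma_a, gamma_i):
    # Solve (J K + gamma_a * l * I + gamma_i * l / n^2 * L K) alpha = Y,
    # where J marks labeled points and Y holds labels padded with zeros.
    n, l = K.shape[0], len(lab_idx)
    J = np.zeros((n, n))
    J[lab_idx, lab_idx] = 1.0
    Y = np.zeros(n)
    Y[lab_idx] = y_lab
    A = J @ K + gamma_a * l * np.eye(n) + (gamma_i * l / n ** 2) * (L @ K)
    return np.linalg.solve(A, Y)

def kfold_cv_error(X, y_lab, lab_idx, gamma_a, gamma_i, k=5, seed=0):
    # Naive k-fold CV over the labeled points: one full solve per fold.
    K = rbf_kernel(X, X)
    L = graph_laplacian(X)
    perm = np.random.default_rng(seed).permutation(len(lab_idx))
    errs = []
    for fold in np.array_split(perm, k):
        train = np.setdiff1d(perm, fold)  # positions into lab_idx
        alpha = fit_laprls(K, L, y_lab[train], lab_idx[train], gamma_a, gamma_i)
        pred = K[lab_idx[fold]] @ alpha   # held-out points stay in the graph
        errs.append(np.mean((pred - y_lab[fold]) ** 2))
    return float(np.mean(errs))

# Synthetic two-cluster data: 40 points, 10 of them labeled.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-3.0, 0.0], 0.3, (20, 2)),
               rng.normal([3.0, 0.0], 0.3, (20, 2))])
lab_idx = np.array([0, 1, 2, 3, 4, 20, 21, 22, 23, 24])
y_lab = np.array([-1.0] * 5 + [1.0] * 5)

# Grid search: k solves per (gamma_a, gamma_i) pair.
grid = [(ga, gi) for ga in (1e-3, 1e-1) for gi in (1e-3, 1e-1)]
scores = {pair: kfold_cv_error(X, y_lab, lab_idx, *pair) for pair in grid}
best = min(scores, key=scores.get)
```

With a grid of g hyper-parameter pairs and k folds, this baseline performs g × k solves of an n × n system, which is the repeated-training cost that the approximate CV in the abstract aims to replace with a single training run.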

