LG · ML · Feb 13, 2019

Efficient Cross-Validation for Semi-Supervised Learning

arXiv:1902.04768v1 · 1 citation
AI Analysis

This work addresses efficiency issues in semi-supervised learning for practitioners, but it is incremental as it builds on existing manifold regularization techniques.

The paper tackles the high computational cost of cross-validation (CV) for hyperparameter selection in manifold regularization methods such as LapRLS and LapSVM. It proposes an approximate CV method based on the Bouligand influence function that requires training only once. In the reported experiments, the approximation shows no statistical discrepancy from standard CV at a much lower time cost.

Manifold regularization, such as Laplacian regularized least squares (LapRLS) and the Laplacian support vector machine (LapSVM), has been widely used in semi-supervised learning, and its performance depends greatly on the choice of hyper-parameters. Cross-validation (CV) is the most popular approach for selecting the optimal hyper-parameters, but it has high complexity because the learner must be trained multiple times. In this paper, we provide a method to approximate the CV for manifold regularization based on a notion from robust statistics called the Bouligand influence function (BIF). We first provide a strategy for approximating the CV via the Taylor expansion of the BIF. Then, we show how to calculate the BIF for a general loss function, and further give the approximate CV criteria for model selection in manifold regularization. The proposed approximate CV for manifold regularization requires training only once, and hence can significantly improve the efficiency of traditional CV. Experimental results show that our approximate CV has no statistical discrepancy with the original one, but a much smaller time cost.
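
To make the cost that the abstract describes concrete, here is a minimal sketch of the baseline, not the paper's BIF-based method: plain k-fold CV for LapRLS, where each hyper-parameter pair triggers k separate trainings. It uses the standard closed-form LapRLS solution from the manifold regularization literature. The function names (`rbf_kernel`, `fit_laprls`, `kfold_cv_error`), the RBF kernel, the dense kernel-weighted graph, the synthetic data, and the choice to keep held-out points in the graph as unlabeled points are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_laprls(X, y, labeled, gamma_a, gamma_i, sigma=1.0):
    # Closed-form LapRLS: solve
    #   (J K + gamma_a * l * I + gamma_i * l / n^2 * L K) alpha = Y
    # where l = number of labeled points and n = labeled + unlabeled.
    n = X.shape[0]
    l = int(labeled.sum())
    K = rbf_kernel(X, X, sigma)
    W = K.copy()                      # assumption: dense graph weighted by the kernel
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W    # unnormalized graph Laplacian
    J = np.diag(labeled.astype(float))
    Y = np.where(labeled, y, 0.0)     # labels of unlabeled points are ignored
    A = J @ K + gamma_a * l * np.eye(n) + (gamma_i * l / n ** 2) * (L @ K)
    return np.linalg.solve(A, Y)

def kfold_cv_error(X, y, labeled, gamma_a, gamma_i, k=5, seed=0):
    # Plain k-fold CV over the labeled points: one full training per fold.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(np.flatnonzero(labeled)), k)
    errors, n_fits = [], 0
    for fold in folds:
        mask = labeled.copy()
        mask[fold] = False            # hide held-out labels; points stay in the graph
        alpha = fit_laprls(X, y, mask, gamma_a, gamma_i)
        n_fits += 1
        pred = rbf_kernel(X[fold], X) @ alpha
        errors.append(np.mean((pred - y[fold]) ** 2))
    return float(np.mean(errors)), n_fits

# Synthetic example: 60 points, 20 labeled, a 2 x 2 hyper-parameter grid.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.sign(X[:, 0])
labeled = np.zeros(60, dtype=bool)
labeled[:20] = True

grid = [(ga, gi) for ga in (1e-3, 1e-1) for gi in (1e-2, 1.0)]
results, total_fits = {}, 0
for ga, gi in grid:
    err, n_fits = kfold_cv_error(X, y, labeled, ga, gi, k=5)
    results[(ga, gi)] = err
    total_fits += n_fits

best = min(results, key=results.get)
print("selected (gamma_a, gamma_i):", best)
print("trainings performed:", total_fits)
```

With a grid of G hyper-parameter pairs and k folds, this baseline solves G × k linear systems of size n. This repeated training is the cost that the approximate CV described in the abstract is meant to avoid.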
