ST · LG · ML · Apr 7, 2012

Density-sensitive semisupervised inference

arXiv:1204.1685v2 · 35 citations
AI Analysis

This paper provides a theoretical basis for semisupervised inference, addressing a foundational gap in machine learning. The contribution is incremental in that it builds on existing assumptions.

The authors address the lack of theoretical foundations in semisupervised learning by developing a minimax framework for analyzing methods that use distribution-sensitive metrics. A parameter α controls the strength of the semisupervised assumption, and the procedure adapts to α from the data.

Semisupervised methods are techniques for using labeled data $(X_1,Y_1),\ldots,(X_n,Y_n)$ together with unlabeled data $X_{n+1},\ldots,X_N$ to make predictions. These methods invoke some assumptions that link the marginal distribution $P_X$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high density regions of $P_X$. Many of the methods are ad-hoc and have been shown to work in specific examples but are lacking a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_X$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
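To make the idea of a metric that is sensitive to $P_X$ concrete, here is a minimal one-dimensional sketch. It is not taken from the abstract: the particular metric (path length weighted by the estimated density raised to the power $-\alpha$), the Gaussian kernel density estimate, the kernel-weighted predictor, and all function and parameter names (`kde`, `density_distance`, `predict`, `h`, `bandwidth`) are assumptions chosen for illustration. In one dimension the only path between two points is the segment joining them, so the density-weighted path length reduces to an ordinary integral.

```python
import math


def kde(x, sample, h):
    """Gaussian kernel density estimate at x from a 1-D sample (illustrative)."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / norm


def density_distance(a, b, sample, alpha, h, steps=200):
    """Midpoint-rule approximation of the integral of p_hat(t)**(-alpha) over [a, b].

    Assumed illustrative metric: low-density stretches cost more to cross.
    With alpha = 0 the integrand is 1 and the result is |a - b|.
    """
    lo, hi = min(a, b), max(a, b)
    dt = (hi - lo) / steps
    return sum(dt / kde(lo + (k + 0.5) * dt, sample, h) ** alpha
               for k in range(steps))


def predict(x, labeled, sample, alpha, h, bandwidth):
    """Kernel-weighted average of labels, with density_distance replacing |x - x_i|."""
    dists = [density_distance(x, xi, sample, alpha, h) for xi, _ in labeled]
    weights = [math.exp(-0.5 * (d / bandwidth) ** 2) for d in dists]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed: fall back to the nearest labeled point.
        return labeled[dists.index(min(dists))][1]
    return sum(w * y for w, (_, y) in zip(weights, labeled)) / total


# Two clusters separated by an empty gap; the density estimate uses all X values,
# labeled and unlabeled alike.
cluster_a = [i / 10 for i in range(11)]       # 0.0, 0.1, ..., 1.0
cluster_b = [2 + i / 10 for i in range(11)]   # 2.0, 2.1, ..., 3.0
sample = cluster_a + cluster_b
labeled = [(0.0, 0.0), (2.0, 1.0)]            # one labeled point per cluster
query = 1.0                                   # Euclidean-equidistant from both labels

euclid = predict(query, labeled, sample, alpha=0.0, h=0.2, bandwidth=1.0)
dense = predict(query, labeled, sample, alpha=1.0, h=0.2, bandwidth=1.0)
print(euclid, dense)
```

With `alpha=0.0` the two labeled points are equally far from the query, so the prediction is their average. With `alpha=1.0` the labeled point across the gap becomes much farther away than the one inside the query's own cluster, which pulls the prediction toward the in-cluster label. This is the sense in which $\alpha$ controls how strongly the marginal distribution is allowed to shape the estimate; the paper's data-driven choice of $\alpha$ is not reproduced here.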

