ML | CO | Oct 7, 2014

PAC-Bayesian AUC classification and scoring

arXiv:1410.1771v2 | 22 citations
Originality: Incremental advance
AI Analysis

This work addresses classification and feature selection under the AUC criterion, offering a theoretically grounded method with practical computational tools. It appears incremental, since it builds on existing PAC-Bayesian and AUC frameworks.

The paper tackles classification and scoring under the AUC criterion by developing a PAC-Bayesian approach for linear and non-linear score functions. It derives non-asymptotic bounds for Gaussian and spike-and-slab priors and provides two computational algorithms: Sequential Monte Carlo, usable as a gold standard, and Expectation-Propagation, which is much faster but approximate.

We develop a scoring and classification procedure based on the PAC-Bayesian approach and the AUC (Area Under Curve) criterion. We focus initially on the class of linear score functions. We derive PAC-Bayesian non-asymptotic bounds for two types of prior for the score parameters: a Gaussian prior, and a spike-and-slab prior; the latter makes it possible to perform feature selection. One important advantage of our approach is that it is amenable to powerful Bayesian computational tools. We derive in particular a Sequential Monte Carlo algorithm, as an efficient method which may be used as a gold standard, and an Expectation-Propagation algorithm, as a much faster but approximate method. We also extend our method to a class of non-linear score functions, essentially leading to a nonparametric procedure, by considering a Gaussian process prior.
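To make the AUC criterion for linear score functions concrete, the following is a minimal sketch rather than the paper's implementation. It assumes labels coded as -1/+1, a score of the form s(x) = <theta, x>, a risk normalised over positive-negative pairs, and an isotropic Gaussian prior. The names `auc_risk`, `gibbs_log_weight`, `gamma`, and `prior_var` are hypothetical and chosen only for illustration. The second function evaluates the unnormalised log-density of a Gibbs-type pseudo-posterior, which is the kind of target a sampler such as Sequential Monte Carlo would be run against.

```python
import numpy as np

def auc_risk(theta, X, y):
    """Empirical pairwise misranking rate of the linear score s(x) = <theta, x>.

    Equals 1 - AUC when there are no tied scores; ties are counted as errors.
    Assumes y contains both -1 and +1 labels (otherwise the result is NaN).
    """
    scores = X @ theta
    pos = scores[y == 1]
    neg = scores[y == -1]
    # A (positive, negative) pair is misranked when the positive example
    # does not receive a strictly higher score than the negative one.
    misranked = pos[:, None] <= neg[None, :]
    return misranked.mean()

def gibbs_log_weight(theta, X, y, gamma, prior_var):
    """Unnormalised log-density: Gaussian log-prior minus gamma times the risk."""
    log_prior = -0.5 * np.dot(theta, theta) / prior_var
    return log_prior - gamma * auc_risk(theta, X, y)

# Tiny illustrative data set: one feature, perfectly separable.
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([-1, -1, 1, 1])

risk_good = auc_risk(np.array([1.0]), X_demo, y_demo)   # ranks every pair correctly
risk_bad = auc_risk(np.array([-1.0]), X_demo, y_demo)   # reverses every pair
log_w = gibbs_log_weight(np.array([1.0]), X_demo, y_demo, gamma=10.0, prior_var=1.0)
```

Here `gamma` plays the role of an inverse-temperature parameter that trades off fit against the prior; how it should be tuned is not something this sketch settles.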

