ML · LG · CO · Jan 23, 2019

Large dimensional analysis of general margin based classification methods

arXiv:1901.08057v2 · 9 citations
AI Analysis

This work addresses classifier selection and parameter tuning for practitioners in machine learning and statistics. It offers theoretical insights but is incremental in nature.

The paper investigates the asymptotic performance of a broad family of large-margin classifiers under high-dimensional settings to determine which classifier is best for a given task, finding that analytical results closely match simulations and provide guidance for selection and tuning.

Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Since a large number of classifiers are available, one natural question is which type of classifier should be used for a particular classification task. We answer this question by investigating the asymptotic performance of a family of large-margin classifiers under two-component mixture models in situations where the data dimension $p$ and the sample size $n$ are both large. This family covers a broad range of classifiers, including support vector machine, distance weighted discrimination, penalized logistic regression, and large-margin unified machine as special cases. The asymptotic results are described by a set of nonlinear equations, and we observe that they closely match Monte Carlo simulations on finite data samples. Our analytical studies shed new light on how to select the best classifier among various classification methods as well as on how to choose the optimal tuning parameters for a given method.
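To make the classifier family in the abstract concrete, here is a minimal Python sketch of the margin losses usually associated with the four methods it names. The parameterizations are the commonly cited textbook forms and are an assumption here, not taken from this page or confirmed against the paper; the function names are illustrative only. Each loss takes the functional margin $u = y f(x)$.

```python
import math


def hinge_loss(u):
    """SVM hinge loss: max(0, 1 - u)."""
    return max(0.0, 1.0 - u)


def logistic_loss(u):
    """Logistic regression loss log(1 + exp(-u)), written to avoid overflow."""
    if u >= 0:
        return math.log1p(math.exp(-u))
    return -u + math.log1p(math.exp(u))


def lum_loss(u, a=1.0, c=1.0):
    """Large-margin unified machine loss (assumed standard form).

    Linear (1 - u) below the threshold c / (1 + c), then a polynomially
    decaying tail controlled by a > 0 and c >= 0.
    """
    threshold = c / (1.0 + c)
    if u < threshold:
        return 1.0 - u
    return (1.0 / (1.0 + c)) * (a / ((1.0 + c) * u - c + a)) ** a


def dwd_loss(u):
    """Distance weighted discrimination loss (assumed form): LUM with a = 1, c = 1."""
    if u <= 0.5:
        return 1.0 - u
    return 1.0 / (4.0 * u)


if __name__ == "__main__":
    # Compare the losses on a few margins.
    for u in (-1.0, 0.0, 0.5, 1.0, 2.0):
        print(u, hinge_loss(u), logistic_loss(u), dwd_loss(u), lum_loss(u))
```

Under these assumed forms, the DWD loss coincides with the LUM loss at $a = 1$, $c = 1$, and the LUM loss approaches the hinge loss as $c$ grows, which is one way to see why a single analysis can cover all of them.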

