ML | LG | Jun 11, 2020

Improved Design of Quadratic Discriminant Analysis Classifier in Unbalanced Settings

arXiv:2006.06355v3
AI Analysis

This paper addresses classification accuracy on unbalanced data, a concern for users of statistical learning methods, but the contribution is incremental because it builds on existing regularized QDA.

The paper tackles the poor performance of quadratic discriminant analysis (QDA) classifiers in unbalanced data settings, where they can end up assigning all observations to the same class. It proposes an improved regularized QDA with two regularization parameters and a modified bias, and reports significantly better performance on real and synthetic datasets.

The use of quadratic discriminant analysis (QDA) or its regularized version (R-QDA) for classification is often not recommended, due to its well-acknowledged high sensitivity to the estimation noise of the covariance matrix. This is all the more true in unbalanced data settings, for which it has been found that R-QDA becomes equivalent to the classifier that assigns all observations to the same class. In this paper, we propose an improved R-QDA that is based on the use of two regularization parameters and a modified bias, properly chosen to avoid inappropriate behaviors of R-QDA in unbalanced settings and to ensure the best possible classification performance. The design of the proposed classifier builds on a refined asymptotic analysis of its performance when the number of samples and the number of features grow large simultaneously, which makes it possible to cope efficiently with the high dimensionality frequently met within the big data paradigm. The performance of the proposed classifier is assessed on both real and synthetic data sets and is shown to be much better than what one would expect from a traditional R-QDA.
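
To make the idea of "two regularization parameters and a modified bias" concrete, the following is a minimal sketch of a two-class regularized QDA and not the paper's actual design. It assumes a simple ridge regularization (sample covariance plus gamma times the identity) with one gamma per class, and it treats the bias as a plain additive offset on the difference of discriminant scores. The function names `fit_rqda`, `discriminant`, and `predict_rqda`, and the parameters `gammas` and `bias`, are hypothetical; the paper selects its parameters through an asymptotic analysis that is not reproduced here.

```python
import numpy as np


def fit_rqda(X, y, gammas=(1.0, 1.0)):
    """Fit a two-class ridge-regularized QDA; gammas[i] regularizes class i."""
    n, p = X.shape
    params = []
    for label, gamma in zip((0, 1), gammas):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # Sample covariance plus a ridge term (an assumed regularization form).
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + gamma * np.eye(p)
        _, logdet = np.linalg.slogdet(cov)
        log_prior = np.log(len(Xc) / n)
        params.append((mean, np.linalg.inv(cov), logdet, log_prior))
    return params


def discriminant(X, mean, precision, logdet, log_prior):
    """Gaussian quadratic discriminant score for each row of X."""
    d = X - mean
    quad = np.einsum("ij,jk,ik->i", d, precision, d)
    return -0.5 * logdet - 0.5 * quad + log_prior


def predict_rqda(X, params, bias=0.0):
    """Return 1 where score_1 - score_0 + bias > 0, else 0.

    `bias` is a hypothetical additive offset standing in for a modified bias.
    """
    s0 = discriminant(X, *params[0])
    s1 = discriminant(X, *params[1])
    return (s1 - s0 + bias > 0).astype(int)
```

With `bias=0.0` this reduces to an ordinary regularized QDA with class priors estimated from the sample. In a heavily unbalanced sample the prior term and the noisier minority-class covariance can push every prediction toward one class, which is the failure mode the abstract describes; separate `gammas` and a tunable `bias` are the two knobs this sketch exposes for counteracting it.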

