LG SP ML

Oct 17, 2023

Minimally Informed Linear Discriminant Analysis: training an LDA model with unlabelled data

Nicolas Heintz, Tom Francart, Alexander Bertrand

arXiv:2310.11110v12.03 citations

h-index: 35

Originality: Incremental advance

AI Analysis

The paper provides a method for semi-supervised classification that reduces labeling effort. The advance is incremental because it builds on the well-established LDA framework.

The paper tackles the problem of training Linear Discriminant Analysis (LDA) models without labeled data. It shows that the exact projection vector can be computed from unlabeled data given a single piece of minimal prior information: the average of one of the two classes, the difference between the class averages (up to a scaling), or the class covariance matrices (up to a scaling). Experiments show that this minimally informed LDA (MILDA) closely matches supervised LDA performance.

Linear Discriminant Analysis (LDA) is one of the oldest and most popular linear methods for supervised classification problems. In this paper, we demonstrate that it is possible to compute the exact projection vector from LDA models based on unlabelled data, if some minimal prior information is available. More precisely, we show that only one of the following three pieces of information is actually sufficient to compute the LDA projection vector if only unlabelled data are available: (1) the class average of one of the two classes, (2) the difference between both class averages (up to a scaling), or (3) the class covariance matrices (up to a scaling). These theoretical results are validated in numerical experiments, demonstrating that this minimally informed Linear Discriminant Analysis (MILDA) model closely matches the performance of a supervised LDA model. Furthermore, we show that the MILDA projection vector can be computed in a closed form with a computational cost comparable to LDA and is able to quickly adapt to non-stationary data, making it well-suited to use as an adaptive classifier.
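One way to see why option (2) can suffice is a standard identity, which is not necessarily the derivation used in the paper. For two classes, the total scatter of the data equals the within-class scatter plus a rank-one term along the difference of the class averages. By the Sherman-Morrison formula, applying the inverse of the total (unlabelled) covariance to that difference therefore gives a vector parallel to the supervised LDA projection vector. The Python sketch below checks this numerically on synthetic Gaussian data. The function names, the synthetic setup, and the scale factor 3.7 are illustrative assumptions; labels are used only to build the supervised reference and to simulate the prior knowledge.

```python
import numpy as np


def lda_direction_supervised(X, y):
    """Reference: Fisher/LDA direction from labelled data, Sw^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix (unnormalised; scale does not
    # affect the direction of the projection vector).
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    return np.linalg.solve(Sw, m1 - m0)


def lda_direction_from_mean_difference(X, mean_diff):
    """Sketch of case (2): unlabelled data X plus the difference between
    the class averages, known only up to a scaling (hypothetical helper)."""
    R = np.cov(X, rowvar=False)  # covariance of ALL samples, no labels used
    return np.linalg.solve(R, mean_diff)


def cosine(a, b):
    """Cosine similarity; 1.0 means the two directions coincide."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic two-class Gaussian data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
n0, n1, dim = 400, 600, 5
A = rng.normal(size=(dim, dim))
shared_cov = A @ A.T + np.eye(dim)
mu0, mu1 = np.zeros(dim), rng.normal(size=dim)
X = np.vstack([
    rng.multivariate_normal(mu0, shared_cov, n0),
    rng.multivariate_normal(mu1, shared_cov, n1),
])
y = np.concatenate([np.zeros(n0, dtype=int), np.ones(n1, dtype=int)])

w_supervised = lda_direction_supervised(X, y)

# Simulate the prior information: the class-average difference with an
# arbitrary positive scale (3.7 is an arbitrary illustrative choice).
known_diff = 3.7 * (X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
w_unlabelled = lda_direction_from_mean_difference(X, known_diff)

similarity = cosine(w_supervised, w_unlabelled)
print(f"cosine similarity between the two directions: {similarity:.12f}")
```

The two vectors differ only by a positive scale factor, which does not change the projection direction; a decision threshold along that direction still has to be chosen separately. Cases (1) and (3) rely on results from the paper itself and are not covered by this sketch.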

