ML, LG | Dec 16, 2020

A connection between the pattern classification problem and the General Linear Model for statistical inference

Juan Manuel Gorriz, SIPBA group, John Suckling

arXiv:2012.08903v12.78 citations, h-index: 63

Originality: Incremental advance

AI Analysis

This work provides a theoretical link between classical statistical inference and machine learning, and may offer new statistical testing methods to researchers who use predictive algorithms. It represents an incremental advance in statistical methodology.

This paper establishes a connection between the General Linear Model (GLM) and machine learning (MLE)-based inference, specifically linking GLM parameter estimation to a Linear Regression Model (LRM) of an indicator matrix. It then derives a statistical test based on Support Vector Machines (SVM) within a permutation analysis, demonstrating that parameter estimations from different models lead to varying classification performances and that the proposed predictive algorithms offer a good trade-off between type I error and statistical power on real data.
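The least-squares link between the two regressions can be written out explicitly under simple assumptions. The notation below is assumed for illustration and may differ from the paper's: $X$ is the indicator (design) matrix, $Y$ holds the observations, and both $X^\top X$ and $Y^\top Y$ are taken to be invertible. The GLM regresses observations on labels, while the LRM of the indicator matrix regresses labels on observations.

```latex
\begin{aligned}
\text{GLM:}\quad & Y = X B + E, & \hat{B} &= (X^\top X)^{-1} X^\top Y, \\
\text{LRM:}\quad & X = Y W + E', & \hat{W} &= (Y^\top Y)^{-1} Y^\top X .
\end{aligned}

% Since X^\top Y = (X^\top X)\hat{B} and X^\top X is symmetric,
% Y^\top X = \hat{B}^\top (X^\top X), hence
\hat{W} = (Y^\top Y)^{-1}\, \hat{B}^\top\, (X^\top X).

% Scalar case (one regressor x, one response y):
\hat{w} = \hat{b}\,\frac{x^\top x}{y^\top y}.
```

In the scalar case the two estimates differ only by the ratio $x^\top x / y^\top y$, which is one way to read the "normalization value" that links the observation and label domains.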

This paper describes a connection between the General Linear Model (GLM), used in combination with classical statistical inference, and machine learning (MLE)-based inference. First, the estimation of the GLM parameters is expressed as a Linear Regression Model (LRM) of an indicator matrix, that is, in terms of the inverse problem of regressing the observations. In other words, the two approaches, GLM and LRM, apply to different domains, the observation and the label domains, and are linked by a normalization value at the least-squares solution. From this relationship we then derive a statistical test based on a more refined predictive algorithm, the (non)linear Support Vector Machine (SVM), which maximizes the class margin of separation, within a permutation analysis. The MLE-based inference employs a residual score and includes the upper bound to compute a better estimate of the actual (real) error. Experimental results demonstrate that the parameter estimates derived from each model result in different classification performances in the equivalent inverse problem. Moreover, on real data, the aforementioned predictive algorithms within permutation tests, including such model-free estimators, provide a good trade-off between type I error and statistical power.
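The idea of a classifier-based permutation test can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear SVM trained by plain batch subgradient descent, uses raw training accuracy as the test statistic (the upper-bound correction mentioned in the abstract is not included), and runs on synthetic two-class data. All function names and parameter values are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Batch subgradient descent on the L2-regularized hinge loss; labels are -1/+1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def training_accuracy(X, y):
    """Fit the SVM on (X, y) and score it on the same data (assumed statistic)."""
    w, b = train_linear_svm(X, y)
    hits = sum(1 for xi, yi in zip(X, y) if yi * (dot(w, xi) + b) > 0)
    return hits / len(y)

def permutation_test(X, y, n_perm=99, seed=0):
    """Return the observed statistic and its permutation p-value."""
    rng = random.Random(seed)
    observed = training_accuracy(X, y)
    exceed = 0
    for _ in range(n_perm):
        y_perm = rng.sample(y, len(y))  # random relabeling, class sizes preserved
        if training_accuracy(X, y_perm) >= observed:
            exceed += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return observed, (1 + exceed) / (1 + n_perm)

def make_two_class_data(n_per_class=20, shift=2.0, seed=1):
    """Synthetic 2-D Gaussian classes centered at (-shift, -shift) and (shift, shift)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0), rng.gauss(label * shift, 1.0)])
            y.append(label)
    return X, y

X, y = make_two_class_data()
accuracy, p_value = permutation_test(X, y)
print(f"training accuracy = {accuracy:.3f}, permutation p-value = {p_value:.3f}")
```

Shuffling the labels breaks any association between observations and classes, so the permuted accuracies approximate the null distribution of the statistic without a parametric model. With 99 permutations the smallest attainable p-value is 0.01.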
