ME LG ML Aug 15, 2022

Predictive Data Calibration for Linear Correlation Significance Testing

Kaustubh R. Patil, Simon B. Eickhoff, Robert Langner

arXiv:2208.07081v11.23 citations h-index: 129

Originality: Incremental advance

AI Analysis

The paper addresses a fundamental issue in statistical analysis for researchers and practitioners who rely on correlation testing. It offers a potential improvement over standard methods, though the advance appears incremental because it builds on existing calibration concepts.

The paper tackles two problems with Pearson's correlation coefficient (PCC) as a measure of linear relationships. First, the estimated strength r can be unreliable because of limited sample size and nonnormality. Second, misreading a p-value as a posterior probability leads to Type I errors in significance testing, a problem that worsens when multiple hypotheses are tested. The paper proposes a predictive data calibration method that conditions the data on the expected linear relationship, yielding calibrated p-values interpretable as posterior probabilities together with calibrated r estimates. It supports the method with empirical evidence from simulations and real-world data.

Inferring linear relationships lies at the heart of many empirical investigations. A measure of linear dependence should correctly evaluate the strength of the relationship as well as qualify whether it is meaningful for the population. Pearson's correlation coefficient (PCC), the de facto measure for bivariate relationships, is known to fall short in both regards. The estimated strength $r$ may be wrong due to limited sample size and nonnormality of the data. In the context of statistical significance testing, erroneous interpretation of a $p$-value as a posterior probability leads to Type I errors -- a general issue with significance testing that extends to PCC. Such errors are exacerbated when testing multiple hypotheses simultaneously. To tackle these issues, we propose a machine-learning-based predictive data calibration method which essentially conditions the data samples on the expected linear relationship. Calculating PCC using calibrated data yields a calibrated $p$-value that can be interpreted as a posterior probability together with a calibrated $r$ estimate, a desired outcome not provided by other methods. Furthermore, the ensuing independent interpretation of each test might eliminate the need for multiple testing correction. We provide empirical evidence favouring the proposed method using several simulations and application to real-world data.
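The multiple-testing problem mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's calibration method; it only shows how often uncorrected PCC significance tests produce at least one false positive when every pair of variables is truly uncorrelated. The function names (`pearson_r`, `fisher_p_value`, `familywise_error_rate`) and all parameter values are hypothetical choices made for this illustration, and the p-value uses the Fisher z normal approximation rather than the exact t-test so that the example needs only the standard library.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z, needs n > 3, |r| < 1)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def familywise_error_rate(n_tests, n_samples=30, alpha=0.05, n_repeats=1000, seed=0):
    """Fraction of repeats in which at least one of n_tests null tests is 'significant'.

    Every x, y pair is drawn independently, so each rejection is a Type I error.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_repeats):
        for _ in range(n_tests):
            x = [rng.gauss(0, 1) for _ in range(n_samples)]
            y = [rng.gauss(0, 1) for _ in range(n_samples)]
            if fisher_p_value(pearson_r(x, y), n_samples) < alpha:
                hits += 1
                break  # one false positive is enough for this repeat
    return hits / n_repeats


# Assumed settings: 1 test versus 20 simultaneous tests, no correction applied.
single = familywise_error_rate(1)
many = familywise_error_rate(20)
print(f"P(at least one false positive), 1 test:   {single:.3f}")
print(f"P(at least one false positive), 20 tests: {many:.3f}")
```

With a single test the false-positive rate should sit near the nominal level `alpha`, while with 20 independent tests it should approach 1 - (1 - alpha)^20. This gap is what standard multiple-testing corrections address, and what the abstract suggests calibrated, independently interpretable tests might make unnecessary.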
