ML LG ST ME | Nov 16, 2017

Predictive Independence Testing, Predictive Conditional Independence Testing, and Predictive Graphical Modelling

arXiv:1711.05869v24.88 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper offers statisticians and data scientists a scalable and generalizable approach to inference and modelling, though the advance is incremental in that it applies existing supervised learning techniques.

The paper tackles the lack of a practical workflow for testing multivariate and conditional independence by linking it to supervised learning, which enables automated tuning and scalable methods. It shows empirically that the proposed predictive independence tests outperform or match current practice, and that the derived structure learning algorithms recover the true graphical model asymptotically.

Testing (conditional) independence of multivariate random variables is a task central to statistical inference and modelling in general, though unfortunately one for which to date there does not exist a practicable workflow. State-of-the-art workflows suffer from the need for heuristic or subjective manual choices, high computational complexity, or strong parametric assumptions. We address these problems by establishing a theoretical link between multivariate/conditional independence testing and model comparison in the multivariate predictive modelling (aka supervised learning) task. This link allows advances in the extensively studied supervised learning workflow to be directly transferred to independence testing workflows, including automated tuning of the machine learning type, which addresses the need for a heuristic choice, the ability to quantitatively trade off computational demand against accuracy, and the modern black-box philosophy for checking and interfacing. As a practical implementation of this link between the two workflows, we present a Python package 'pcit', which implements our novel multivariate and conditional independence tests, interfacing the supervised learning API of the scikit-learn package. Theory and package also allow for straightforward independence-test-based learning of graphical model structure. We empirically show that our proposed predictive independence tests outperform or are on par with current practice, and that the derived graphical model structure learning algorithms asymptotically recover the 'true' graph. This paper, and the 'pcit' package accompanying it, thus provide powerful, scalable, generalizable, and easy-to-use methods for multivariate and conditional independence testing, as well as for graphical model structure learning.
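One way to read the link between independence testing and model comparison is sketched below. This is not the 'pcit' API, and every name in it (`predictive_independence_pvalue`, `train_fraction`, `seed`) is hypothetical. The sketch assumes a univariate linear learner, squared-error loss, a constant mean predictor as the comparison model, and a one-sided normal-approximation test on held-out loss differences. It uses only the Python standard library.

```python
import random
from statistics import NormalDist, mean, stdev


def predictive_independence_pvalue(x, y, train_fraction=0.5, seed=0):
    """One-sided p-value for H0: knowing x does not improve prediction of y.

    Illustrative only: a linear learner can detect linear dependence,
    so any other supervised learner could be swapped in here.
    """
    n = len(x)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_fraction)
    train, test = idx[:cut], idx[cut:]

    # Fit both predictors on the training part only.
    x_bar = mean(x[i] for i in train)
    y_bar = mean(y[i] for i in train)  # constant predictor that ignores x
    sxx = sum((x[i] - x_bar) ** 2 for i in train)
    sxy = sum((x[i] - x_bar) * (y[i] - y_bar) for i in train)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_bar - slope * x_bar

    # Per-sample reduction in squared error on held-out data.
    gains = [
        (y[i] - y_bar) ** 2 - (y[i] - (intercept + slope * x[i])) ** 2
        for i in test
    ]

    spread = stdev(gains)
    if spread == 0:
        return 1.0  # degenerate case: report no evidence against H0
    z = mean(gains) / (spread / len(gains) ** 0.5)
    return 1.0 - NormalDist().cdf(z)


# Synthetic data: one dependent pair and one independent pair.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(400)]
y_dep = [2 * xi + rng.gauss(0, 1) for xi in x]
y_ind = [rng.gauss(0, 1) for _ in x]

p_dep = predictive_independence_pvalue(x, y_dep)
p_ind = predictive_independence_pvalue(x, y_ind)
```

A small `p_dep` indicates that the learner predicts `y_dep` from `x` significantly better than the constant predictor does, which is evidence against independence. Under the same assumptions, a conditional variant would compare a model that sees both the variable of interest and the conditioning variables against one that sees the conditioning variables alone.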
