ML LG Jul 9, 2019

Conditional Independence Testing using Generative Adversarial Networks

arXiv:1907.04068v2 17.966 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of detecting conditional dependence for researchers in statistics and machine learning, particularly in high-dimensional settings like genetics, though it appears incremental as it builds on existing GAN methods for hypothesis testing.

The paper tackles conditional independence testing in high-dimensional feature spaces. It introduces a new test statistic based on a generative adversarial network that approximates the conditional distribution encoding the null hypothesis, with the aim of maximizing power while controlling type I error without distributional assumptions. It reports significant power gains in synthetic simulations and applies the test to genetic data.

We consider the hypothesis testing problem of detecting conditional dependence, with a focus on high-dimensional feature spaces. Our contribution is a new test statistic based on samples from a generative adversarial network designed to approximate directly a conditional distribution that encodes the null hypothesis, in a manner that maximizes power (the rate of true positives). We show that such an approach requires only that density approximation be viable in order to ensure that we control type I error (the rate of false positives); in particular, no assumptions need to be made on the form of the distributions or feature dependencies. Using synthetic simulations with high-dimensional data we demonstrate significant gains in power over competing methods. In addition, we illustrate the use of our test to discover causal markers of disease in genetic data.
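The testing recipe described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the GAN generator is swapped for a simple linear-Gaussian regression sampler of X given Z, the dependence statistic is assumed to be absolute Pearson correlation, and the train/test split is a choice made for this sketch. All function names here are hypothetical. The idea it shows is only the general one: draw surrogate copies of X from an approximation of p(X | Z), which is how the null hypothesis is encoded, and compare the statistic on the real pairs (X, Y) against its distribution on the surrogate pairs.

```python
import numpy as np


def fit_linear_gaussian_sampler(x, z):
    """Stand-in for the GAN generator (an assumption of this sketch).

    Fits x ~ Normal(z @ w + b, sigma^2) by least squares and returns a
    function that draws surrogate x values given new z values.
    """
    design = np.column_stack([z, np.ones(len(z))])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    resid_std = np.std(x - design @ coef)

    def sample(z_new, rng):
        design_new = np.column_stack([z_new, np.ones(len(z_new))])
        noise = resid_std * rng.standard_normal(len(z_new))
        return design_new @ coef + noise

    return sample


def abs_correlation(x, y):
    """Dependence statistic assumed for this sketch: |Pearson correlation|."""
    return abs(np.corrcoef(x, y)[0, 1])


def conditional_independence_pvalue(x, y, z, n_draws=500, seed=0):
    """Monte Carlo p-value for H0: X independent of Y given Z."""
    rng = np.random.default_rng(seed)
    half = len(x) // 2

    # Fit the conditional sampler on the first half of the data only.
    sample_x_given_z = fit_linear_gaussian_sampler(x[:half], z[:half])

    # Evaluate the statistic on the held-out second half.
    x_te, y_te, z_te = x[half:], y[half:], z[half:]
    observed = abs_correlation(x_te, y_te)

    # Surrogates depend on Y only through Z, so they mimic the null.
    null_stats = np.array([
        abs_correlation(sample_x_given_z(z_te, rng), y_te)
        for _ in range(n_draws)
    ])

    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null_stats >= observed)) / (1 + n_draws)
```

In this sketch a small p-value suggests that X carries information about Y beyond what Z explains. Type I error control rests on how well the sampler approximates p(X | Z), which is where the abstract's requirement that density approximation be viable enters; a linear-Gaussian fit would only be adequate for data that is roughly linear-Gaussian.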
