ML LGApr 28, 2017

Learning Quadratic Variance Function (QVF) DAG models via OverDispersion Scoring (ODS)

arXiv:1704.08783v19.048 citations

Originality Highly original

AI Analysis

This work addresses challenges in causal inference for large-scale data, offering a novel approach for learning DAG models with signal-dependent noise, which is incremental in improving identifiability and computational efficiency.

The paper tackles the problem of learning large-scale directed acyclic graph (DAG) models for causal inference by introducing a new class of identifiable models where variance is a quadratic function of the mean (QVF DAG models), and proposes the OverDispersion Scoring (ODS) algorithm, which is shown to be statistically consistent in high-dimensional settings and performs well compared to state-of-the-art methods.

Learning DAG or Bayesian network models is an important problem in multi-variate causal inference. However, a number of challenges arises in learning large-scale DAG models including model identifiability and computational complexity since the space of directed graphs is huge. In this paper, we address these issues in a number of steps for a broad class of DAG models where the noise or variance is signal-dependent. Firstly we introduce a new class of identifiable DAG models, where each node has a distribution where the variance is a quadratic function of the mean (QVF DAG models). Our QVF DAG models include many interesting classes of distributions such as Poisson, Binomial, Geometric, Exponential, Gamma and many other distributions in which the noise variance depends on the mean. We prove that this class of QVF DAG models is identifiable, and introduce a new algorithm, the OverDispersion Scoring (ODS) algorithm, for learning large-scale QVF DAG models. Our algorithm is based on firstly learning the moralized or undirected graphical model representation of the DAG to reduce the DAG search-space, and then exploiting the quadratic variance property to learn the causal ordering. We show through theoretical results and simulations that our algorithm is statistically consistent in the high-dimensional p>n setting provided that the degree of the moralized graph is bounded and performs well compared to state-of-the-art DAG-learning algorithms.

View on arXiv PDF

Similar