LG ML · Nov 3, 2023

Structured Neural Networks for Density Estimation and Causal Inference

Asic Q. Chen, Ruian Shi, Xiang Gao, Ricardo Baptista, Rahul G. Krishnan

arXiv:2311.02221v112.312 citations · h-index: 5

Originality: Highly original

AI Analysis

This work addresses the need for structured neural networks that improve data efficiency and enable causal inference in machine learning. It presents a novel method for a known bottleneck.

The paper tackles the problem of learning neural networks that respect conditional independence structures, such as those in Bayesian networks, by proposing Structured Neural Networks (StrNN) that use masking pathways designed via binary matrix factorization. It demonstrates utility in density estimation and causal inference, achieving data-efficient generative modeling and enabling normalizing flows for causal effect estimation.

Injecting structure into neural networks enables learning functions that satisfy invariances with respect to subsets of inputs. For instance, when learning generative models using neural networks, it is advantageous to encode the conditional independence structure of observed variables, often in the form of Bayesian networks. We propose the Structured Neural Network (StrNN), which injects structure through masking pathways in a neural network. The masks are designed via a novel relationship we explore between neural network architectures and binary matrix factorization, to ensure that the desired independencies are respected. We devise and study practical algorithms for this otherwise NP-hard design problem based on novel objectives that control the model architecture. We demonstrate the utility of StrNN in three applications: (1) binary and Gaussian density estimation with StrNN, (2) real-valued density estimation with Structured Autoregressive Flows (StrAFs) and Structured Continuous Normalizing Flows (StrCNF), and (3) interventional and counterfactual analysis with StrAFs for causal inference. Our work opens up new avenues for learning neural networks that enable data-efficient generative modeling and the use of normalizing flows for causal effect estimation.
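The masking idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' algorithm or objective, only a simple greedy construction under stated assumptions: `A[i, j] = 1` means output `i` is allowed to depend on input `j`, the network has one hidden layer, and the hidden width is at least the input dimension. The names `factorize_mask` and `masked_mlp` are hypothetical. The sketch builds binary masks `M1` and `M2` whose product has the same sparsity pattern as `A`, then applies them to the weights of a two-layer network.

```python
import numpy as np

def factorize_mask(A, hidden):
    """Greedy sketch: binary M1 (hidden x d) and M2 (d x hidden) such that
    M2 @ M1 is nonzero exactly where A is. Assumes hidden >= d."""
    A = np.asarray(A, dtype=int)
    d = A.shape[0]
    # Each hidden unit copies the allowed-input set of one output (rows cycled).
    M1 = A[np.arange(hidden) % d]
    # (M1 @ (1 - A).T)[k, i] counts inputs of hidden unit k that are forbidden
    # for output i; output i may read unit k only when that count is zero.
    M2 = ((M1 @ (1 - A).T) == 0).astype(int).T
    return M1, M2

def masked_mlp(x, W1, b1, W2, b2, M1, M2):
    """One-hidden-layer network with elementwise-masked weight matrices."""
    h = np.tanh((W1 * M1) @ x + b1)
    return (W2 * M2) @ h + b2

# Illustrative structure (an assumption, not taken from the paper):
# x2 and x3 depend on x1; x4 depends on x2 and x3.
A = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 1, 1, 0]])
d, hidden = A.shape[0], 8
M1, M2 = factorize_mask(A, hidden)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(hidden, d)), rng.normal(size=hidden)
W2, b2 = rng.normal(size=(d, hidden)), rng.normal(size=d)
x = rng.normal(size=d)
y = masked_mlp(x, W1, b1, W2, b2, M1, M2)

# The product of the masks reproduces the sparsity pattern of A.
print(np.array_equal((M2 @ M1 > 0).astype(int), A))
```

Because masked weights are exactly zero, perturbing an input that `A` forbids for a given output leaves that output unchanged regardless of the learned weights. Choosing among the many valid factorizations is the design problem the abstract describes as NP-hard; this sketch simply picks one valid factorization without optimizing any objective.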
