LG | AI | CV | ML | Jan 1, 2023

Causal Deep Learning

arXiv:2301.00314v43 citations | h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses causal inference for machine learning researchers. It offers a method that integrates causal reasoning with deep learning, though the advance appears incremental because it builds on existing tensor analysis frameworks.

The authors tackle causal inference in deep learning by deriving causal deep neural networks from tensor factor analysis. The resulting architectures address both forward and inverse causal questions and scale through deep, block-algebra-based designs. The approach is demonstrated on facial images, and sequential, parallel and asynchronous parallel computation strategies are described.

We derive a set of causal deep neural networks whose architectures are a consequence of tensor (multilinear) factor analysis, a framework that facilitates causal inference. Forward causal questions are addressed with a neural network architecture composed of causal capsules and a tensor transformer. Causal capsules compute a set of invariant causal factor representations, whose interactions are governed by a tensor transformation. Inverse causal questions are addressed with a neural network that implements the multilinear projection algorithm. The architecture reverses the order of operations of a forward neural network and estimates the causes of effects. As an alternative to aggressive bottleneck dimension reduction or regularized regression that may camouflage an inherently underdetermined inverse problem, we prescribe modeling different aspects of the mechanism of data formation with piecewise tensor models whose multilinear projections produce multiple candidate solutions. Our forward and inverse questions may be addressed with shallow architectures, but for computationally scalable solutions, we derive a set of deep neural networks by taking advantage of block algebra. An interleaved kernel hierarchy results in doubly non-linear tensor factor models. The causal neural networks that are a consequence of tensor factor analysis are data agnostic, but are illustrated with facial images. Sequential, parallel and asynchronous parallel computation strategies are described.
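
As a rough illustration of the forward and inverse questions in the abstract, a generic multilinear (Tucker-style) factor model can be written as below. The notation is assumed here for illustration and is not taken from the abstract: $\mathcal{D}$ is a data tensor with measurement mode 0 and $C$ causal-factor modes, $\mathcal{T}$ is an extended core tensor, $\mathbf{U}_c$ is the mode matrix of causal factor $c$, $\mathbf{r}_c$ is the representation of factor $c$ for a single observation $\mathbf{d}$, and $\mathbf{T}_{[0]}$ is the mode-0 matrix unfolding of $\mathcal{T}$. The ordering of the Kronecker product depends on the unfolding convention chosen.

```latex
% Forward model (assumed generic form): causal factor representations
% interact through the tensor T to produce the observed data.
\mathcal{D} = \mathcal{T} \times_1 \mathbf{U}_1 \times_2 \mathbf{U}_2 \cdots \times_C \mathbf{U}_C

% A single observation generated from one representation per causal factor.
\mathbf{d} = \mathcal{T} \times_1 \mathbf{r}_1^{\top} \times_2 \mathbf{r}_2^{\top} \cdots \times_C \mathbf{r}_C^{\top}
           = \mathbf{T}_{[0]} \left( \mathbf{r}_C \otimes \cdots \otimes \mathbf{r}_1 \right)

% Inverse question (sketch): project the observation with the pseudo-inverse
% of the unfolded tensor to estimate the product of the causal representations.
\mathbf{T}_{[0]}^{+} \, \mathbf{d} \approx \mathbf{r}_C \otimes \cdots \otimes \mathbf{r}_1
```

Under these assumptions, the individual $\mathbf{r}_c$ would then be recovered by a rank-1 decomposition of the projected vector reshaped as a tensor. When $\mathbf{T}_{[0]}$ does not have full column rank, the projection is not unique, which is one way to read the abstract's remark about an inherently underdetermined inverse problem and its use of piecewise tensor models that yield multiple candidate solutions.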

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
