LG AI ML | Jun 13, 2023

Identification of Nonlinear Latent Hierarchical Models

Lingjing Kong, Biwei Huang, Feng Xie, Eric Xing, Yuejie Chi, Kun Zhang

arXiv:2306.07916v221.130 citations | h-index: 110

Originality: Highly original

AI Analysis

This work addresses a fundamental challenge in causal inference, with applications to biological and medical data, and offers a novel theoretical identifiability guarantee for complex nonlinear settings.

The paper studies how to identify latent variables and causal structures from observational data in nonlinear latent hierarchical models. It shows that identifiability (up to invertible transformations) is achievable under mild assumptions: the graph may contain multiple paths between any pair of variables, and the structural functions may be general nonlinear functions. The authors construct an explicit estimation procedure that achieves identification asymptotically.

Identifying latent variables and causal structures from observational data is essential to many real-world applications involving biological data, medical data, and unstructured data such as images and languages. However, this task can be highly challenging, especially when observed variables are generated by causally related latent variables and the relationships are nonlinear. In this work, we investigate the identification problem for nonlinear latent hierarchical causal models in which observed variables are generated by a set of causally related latent variables, and some latent variables may not have observed children. We show that the identifiability of causal structures and latent variables (up to invertible transformations) can be achieved under mild assumptions: on causal structures, we allow for multiple paths between any pair of variables in the graph, which relaxes latent tree assumptions in prior work; on structural functions, we permit general nonlinearity and multi-dimensional continuous variables, alleviating existing work's parametric assumptions. Specifically, we first develop an identification criterion in the form of novel identifiability guarantees for an elementary latent variable model. Leveraging this criterion, we show that both causal structures and latent variables of the hierarchical model can be identified asymptotically by explicitly constructing an estimation procedure. To the best of our knowledge, our work is the first to establish identifiability guarantees for both causal structures and latent variables in nonlinear latent hierarchical models.
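The model class described in the abstract can be made concrete with a small simulation. The sketch below is a toy illustration only, not the authors' estimation procedure: the graph, the `tanh` structural functions with additive noise, the variable names, and the dimensions are all assumptions chosen for readability, and the example is not claimed to satisfy the paper's identifiability conditions. It shows two features the abstract highlights: latent variables with no observed children (`z1`, `z2`) and multiple paths between a pair of variables (`z1 -> z2 -> z4` and `z1 -> z3 -> z4`).

```python
import numpy as np


def simulate(n_samples=1000, seed=0):
    """Sample from a toy nonlinear latent hierarchical model (hypothetical graph)."""
    rng = np.random.default_rng(seed)

    def structural_fn(parents, out_dim):
        # Assumed structural function: random linear map, tanh nonlinearity,
        # and additive Gaussian noise. The paper permits far more general
        # nonlinear functions; this form is only for illustration.
        weights = rng.normal(size=(parents.shape[1], out_dim))
        noise = rng.normal(scale=0.1, size=(parents.shape[0], out_dim))
        return np.tanh(parents @ weights) + noise

    # Latent layer. z1 and z2 have only latent children (no observed children).
    z1 = rng.normal(size=(n_samples, 1))          # root latent
    z2 = structural_fn(z1, 1)                     # z1 -> z2
    z3 = structural_fn(z1, 1)                     # z1 -> z3
    z4 = structural_fn(np.hstack([z2, z3]), 1)    # z2 -> z4 and z3 -> z4

    # Observed layer: multi-dimensional continuous variables.
    x1 = structural_fn(z4, 2)                     # z4 -> x1
    x2 = structural_fn(z4, 2)                     # z4 -> x2
    x3 = structural_fn(z3, 2)                     # z3 -> x3

    latents = {"z1": z1, "z2": z2, "z3": z3, "z4": z4}
    observed = np.hstack([x1, x2, x3])
    return latents, observed


if __name__ == "__main__":
    latents, observed = simulate(n_samples=500, seed=0)
    print(observed.shape)
```

In the identification problem, only `observed` would be available; the task is to recover the graph over `z1` through `z4` and the latent variables themselves, up to invertible transformations.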
