Learning local neighborhoods of non-Gaussian graphical models: A measure transport approach
This work addresses scalability and flexibility limitations in graphical model learning. It is aimed at statisticians and data scientists and offers an incremental improvement over existing methods such as Lasso-based neighborhood selection, which it includes as a special case.
The paper tackles the problem of identifying conditional independence relationships in high-dimensional non-Gaussian graphical models. It proposes the L-SING algorithm, which uses transport maps for scalable local neighborhood estimation, and demonstrates its effectiveness in Gaussian and non-Gaussian settings, including a biological dataset with more than 150 variables.
Identifying the Markov properties or conditional independencies of a collection of random variables is a fundamental task in statistics for modeling and inference. Existing approaches often learn the structure of a probabilistic graphical model, which encodes these dependencies, by assuming that the variables follow a distribution with a simple parametric form. Moreover, the computational cost of many algorithms scales poorly for high-dimensional distributions, as they need to estimate all the edges in the graph simultaneously. In this work, we propose a scalable algorithm to infer the conditional independence relationships of each variable by exploiting the local Markov property. The proposed method, named Localized Sparsity Identification for Non-Gaussian Distributions (L-SING), estimates the graph by using flexible classes of transport maps to represent the conditional distribution for each variable. We show that L-SING includes existing approaches, such as neighborhood selection with Lasso, as a special case. We demonstrate the effectiveness of our algorithm in both Gaussian and non-Gaussian settings by comparing it to existing methods. Lastly, we show the scalability of the proposed approach by applying it to high-dimensional non-Gaussian examples, including a biological dataset with more than 150 variables.
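To make the local, one-variable-at-a-time idea concrete, the following is a minimal sketch of neighborhood selection with Lasso, the existing approach the abstract names as a special case of L-SING. It is not the L-SING algorithm itself: it assumes a linear-Gaussian model and uses no transport maps. The function names `lasso_cd` and `local_neighborhoods`, the hand-written coordinate-descent solver, the simulated chain data, and the penalty `lam=0.1` are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # mean square of each column
    resid = y.astype(float).copy()      # residual y - X @ beta (beta starts at 0)
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding sets small coefficients exactly to zero.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def local_neighborhoods(X, lam):
    """Estimate each variable's neighbours by regressing it on all the others."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise columns
    d = X.shape[1]
    neighbors = {}
    for k in range(d):
        others = [j for j in range(d) if j != k]
        beta = lasso_cd(X[:, others], X[:, k], lam)
        # Nonzero coefficients mark the estimated neighbours of variable k.
        neighbors[k] = {others[i] for i in np.flatnonzero(beta)}
    return neighbors


# Illustrative data (an assumption): a Gaussian chain 0 - 1 - 2 plus an
# independent variable 3, each with unit variance.
rng = np.random.default_rng(0)
n = 10_000
x0 = rng.standard_normal(n)
x1 = 0.5 * x0 + np.sqrt(0.75) * rng.standard_normal(n)
x2 = 0.5 * x1 + np.sqrt(0.75) * rng.standard_normal(n)
x3 = rng.standard_normal(n)
X = np.column_stack([x0, x1, x2, x3])

neighbors = local_neighborhoods(X, lam=0.1)
print(neighbors)
```

Each variable is handled by its own regression, so the per-node problems are independent and can be solved separately, which is the source of the scalability the local Markov property offers. The per-node estimates need not be symmetric, so a combination rule is needed to turn them into a single undirected graph. In this sketch the only modelling ingredient is the linear regression of one variable on the rest; per the abstract, L-SING replaces that step with flexible transport maps that represent each conditional distribution.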