LG MLAug 11, 2023

Learning nonparametric DAGs with incremental information via high-order HSIC

arXiv:2308.05969v2h-index: 3Has Code

Originality Incremental advance

AI Analysis

This work addresses a specific bottleneck in Bayesian network learning for researchers and practitioners, offering an incremental improvement over existing methods.

The paper tackles the problem of learning directed acyclic graphs (DAGs) when local variables have both direct and indirect dependencies, which causes global score-based methods to miss edges. It introduces a two-phase algorithm that uses high-order HSIC to locally amend global optimization, resulting in improved performance, such as reducing the structure intervention distance by 329.7 compared to CAM on a 40-node graph.

Score-based methods for learning Bayesain networks(BN) aim to maximizing the global score functions. However, if local variables have direct and indirect dependence simultaneously, the global optimization on score functions misses edges between variables with indirect dependent relationship, of which scores are smaller than those with direct dependent relationship. In this paper, we present an identifiability condition based on a determined subset of parents to identify the underlying DAG. By the identifiability condition, we develop a two-phase algorithm namely optimal-tuning (OT) algorithm to locally amend the global optimization. In the optimal phase, an optimization problem based on first-order Hilbert-Schmidt independence criterion (HSIC) gives an estimated skeleton as the initial determined parents subset. In the tuning phase, the skeleton is locally tuned by deletion, addition and DAG-formalization strategies using the theoretically proved incremental properties of high-order HSIC. Numerical experiments for different synthetic datasets and real-world datasets show that the OT algorithm outperforms existing methods. Especially in Sigmoid Mix model with the size of the graph being ${\rm\bf d=40}$, the structure intervention distance (SID) of the OT algorithm is 329.7 smaller than the one obtained by CAM, which indicates that the graph estimated by the OT algorithm misses fewer edges compared with CAM.Source code of the OT algorithm is available at https://github.com/YafeiannWang/optimal-tune-algorithm.

View on arXiv PDF Code

Similar