LG · Jun 29, 2025

When Additive Noise Meets Unobserved Mediators: Bivariate Denoising Diffusion for Causal Discovery

arXiv:2506.23374v1 · 1 citation · h-index: 17
Originality: Incremental advance
AI Analysis

The paper addresses a practical limitation of bivariate causal discovery when hidden variables are present, though it appears to be an incremental improvement over existing additive noise model frameworks.

The paper tackles the problem of bivariate causal discovery when unobserved mediators corrupt relationships, showing that standard additive noise models fail in such settings. The proposed Bivariate Denoising Diffusion method outperforms existing approaches in mediator-corrupted scenarios while maintaining strong performance in standard settings.
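To make the failure mode concrete, the following is a minimal formal sketch, assuming a single unobserved mediator $M$ with additive noise at each step. The symbols $f$, $g$, $M$, $N_M$ and $N_Y$ are illustrative notation and are not taken from the paper.

```latex
% Standard ANM: the noise is additive and independent of the cause.
Y = f(X) + N, \qquad N \perp X

% Assumed single hidden mediator M, with additive noise at each step:
M = f(X) + N_M, \qquad Y = g(M) + N_Y, \qquad X,\ N_M,\ N_Y \text{ mutually independent}

% Marginalizing out M, the observed pair (X, Y) satisfies
Y = g\bigl(f(X) + N_M\bigr) + N_Y
```

When $g$ is nonlinear, $N_M$ enters inside $g$ rather than additively, so the residual $Y - \mathbb{E}[Y \mid X]$ generally depends on $X$ even in the true causal direction. That is the independence property standard ANM methods rely on.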

Distinguishing cause from effect in bivariate observational data is a foundational problem in many disciplines, but it is challenging without additional assumptions. Additive noise models (ANMs) are widely used to enable sample-efficient bivariate causal discovery. However, conventional ANM-based methods fail when unobserved mediators corrupt the causal relationship between variables. This paper makes three key contributions. First, we rigorously characterize why standard ANM approaches break down in the presence of unmeasured mediators. Second, we demonstrate that prior solutions for hidden mediation are brittle in finite-sample settings, limiting their practical utility. Third, to address these gaps, we propose Bivariate Denoising Diffusion (BiDD) for causal discovery, a method designed to handle latent noise introduced by unmeasured mediators. Unlike prior methods that infer directionality by comparing mean squared error losses, our approach introduces a novel independence test statistic: during the noising and denoising processes for each variable, we condition on the other variable as input and evaluate the independence of the predicted noise relative to this input. We prove asymptotic consistency of BiDD under the ANM, and conjecture that it performs well under hidden mediation. Experiments on synthetic and real-world data demonstrate consistent performance: BiDD outperforms existing methods in mediator-corrupted settings while maintaining strong performance in mediator-free settings.
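For readers unfamiliar with the ANM baseline the abstract builds on, here is a minimal sketch of classical residual-independence direction testing. It is not BiDD and involves no diffusion model. The polynomial regressor, the fixed-bandwidth Gaussian-kernel HSIC statistic, and all function names (`hsic`, `residual_dependence`, `anm_direction`) are assumptions chosen for illustration.

```python
import numpy as np


def _gram(v, bandwidth=1.0):
    """Gaussian-kernel Gram matrix of a standardized 1-D sample."""
    v = (v - v.mean()) / v.std()
    sq_dist = (v[:, None] - v[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def hsic(a, b):
    """Biased empirical HSIC statistic; larger values suggest dependence."""
    n = len(a)
    centering = np.eye(n) - np.ones((n, n)) / n
    K, L = _gram(a), _gram(b)
    return float(np.trace(K @ centering @ L @ centering)) / n ** 2


def residual_dependence(cause, effect, degree=5):
    """Regress effect on cause, then score dependence of residual on cause."""
    c = (cause - cause.mean()) / cause.std()  # standardize for a stable fit
    coeffs = np.polyfit(c, effect, degree)
    residual = effect - np.polyval(coeffs, c)
    return hsic(c, residual)


def anm_direction(x, y, degree=5):
    """Pick the direction whose regression residual looks more independent."""
    forward = residual_dependence(x, y, degree)   # hypothesis X -> Y
    backward = residual_dependence(y, x, degree)  # hypothesis Y -> X
    label = "X->Y" if forward < backward else "Y->X"
    return label, forward, backward
```

Calling `anm_direction(x, y)` on two equal-length NumPy arrays returns the preferred direction and both dependence scores. This sketch compares raw HSIC values rather than calibrated p-values, and it inherits the limitation the abstract describes: with an unobserved mediator, the residual can depend on the input in both directions.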

