Deception by Omission: Using Adversarial Missingness to Poison Causal Structure Learning
This work addresses a security vulnerability in causal machine learning that arises when data auditing rules out traditional adversarial perturbations. The contribution is incremental, since it extends prior work on adversarial attacks to the missing-data setting.
The paper tackles the problem of adversarial attacks on causal structure learning by introducing a novel method where adversaries omit portions of true training data to bias learned causal models, demonstrating effectiveness in deceiving popular algorithms on real and synthetic datasets.
Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate structural causal models (SCMs). However, when the data can be audited for correctness (e.g., it is cryptographically signed by its source), this adversarial mechanism is invalidated. This work introduces a novel attack methodology wherein the adversary deceptively omits a portion of the true training data to bias the learned causal structures in a desired manner. Theoretically sound attack mechanisms are derived for the case of arbitrary SCMs, and a sample-efficient learning-based heuristic is given for Gaussian SCMs. Experimental validation of these approaches on real and synthetic data sets demonstrates the effectiveness of adversarial missingness attacks at deceiving popular causal structure learning algorithms.
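To make the idea of an omission-only attack concrete, the following is a minimal toy sketch, not the attack mechanism derived in the paper. It assumes a two-variable Gaussian model with no edge between X and Y, a learner that simply discards withheld rows (listwise deletion), and a hypothetical adversary that withholds rows whose signs disagree with probability `drop_prob`. The function names, the deletion rule, and all parameter values are illustrative assumptions. Every retained row is a genuine sample, so an audit of individual records would pass, yet the retained data shows a dependence that the true model does not have.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=20000, drop_prob=0.7, seed=0):
    """Compare correlation on the full data and on adversarially retained data."""
    rng = random.Random(seed)

    # True model (assumption): X and Y are independent standard Gaussians,
    # so the correct causal graph has no edge between them.
    rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

    # Hypothetical adversary: rows where X and Y have opposite signs are
    # withheld with probability drop_prob. No value is ever altered.
    kept = [
        (x, y)
        for x, y in rows
        if x * y >= 0 or rng.random() >= drop_prob
    ]

    full_corr = pearson(*zip(*rows))
    kept_corr = pearson(*zip(*kept))
    return full_corr, kept_corr, len(kept)


full_corr, kept_corr, n_kept = simulate()
print(f"rows kept: {n_kept}")
print(f"correlation on full data:     {full_corr:.3f}")
print(f"correlation on retained data: {kept_corr:.3f}")
```

In this toy setting the correlation on the full data stays near zero, while the retained rows are clearly positively correlated, which is the kind of signal that could lead a structure learner to add a spurious edge. A learner that models the missingness mechanism rather than deleting rows may behave differently; that case is outside this sketch.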