NOTMAD: Estimating Bayesian Networks with Sample-Specific Structures and Parameters
This work addresses the problem of inferring sample-specific Bayesian networks for researchers in fields such as bioinformatics. It offers a way around the loss of statistical power and resolution caused by splitting datasets into subsamples, although the contribution may appear incremental because it builds on existing methods such as NOTEARS.
The authors tackle the challenge of estimating context-specific Bayesian networks without breaking datasets into subsamples, a practice that limits statistical power and resolution. They propose NOTMAD, a method that models these networks as mixtures of archetypal DAGs, with the archetypes learned jointly with the context-specific networks. This enables estimation at single-sample resolution, and the authors demonstrate its utility on patient-specific gene expression networks for cancer analysis.
Context-specific Bayesian networks (i.e. directed acyclic graphs, DAGs) identify context-dependent relationships between variables, but the non-convexity induced by the acyclicity requirement makes it difficult to share information between context-specific estimators (e.g. with graph generator functions). For this reason, existing methods for inferring context-specific Bayesian networks have favored breaking datasets into subsamples, limiting statistical power and resolution, and preventing the use of multidimensional and latent contexts. To overcome this challenge, we propose NOTEARS-optimized Mixtures of Archetypal DAGs (NOTMAD). NOTMAD models context-specific Bayesian networks as the output of a function which learns to mix archetypal networks according to sample context. The archetypal networks are estimated jointly with the context-specific networks and do not require any prior knowledge. We encode the acyclicity constraint as a smooth regularization loss which is back-propagated to the mixing function; in this way, NOTMAD shares information between context-specific acyclic graphs, enabling the estimation of Bayesian network structures and parameters at even single-sample resolution. We demonstrate the utility of NOTMAD and sample-specific network inference through analysis and experiments, including patient-specific gene expression networks which correspond to morphological variation in cancer.
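The two ideas in the abstract, mixing archetypal networks according to context and penalizing cycles with a smooth loss, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The linear encoder, the softmax mixing weights, the toy archetypes, and all function names are assumptions made for illustration. The penalty follows the NOTEARS form h(W) = tr(exp(W ∘ W)) − d, computed here with a truncated power series so the example needs only NumPy.

```python
import numpy as np


def trace_expm(A, terms=30):
    """Approximate tr(exp(A)) with a truncated power series (illustrative choice)."""
    total = 0.0
    term = np.eye(A.shape[0])  # A^0 / 0!
    for k in range(terms):
        total += np.trace(term)
        term = term @ A / (k + 1)  # next series term: A^(k+1) / (k+1)!
    return total


def acyclicity_penalty(W):
    """NOTEARS-style smooth penalty: zero when W encodes a DAG, positive otherwise."""
    d = W.shape[0]
    return trace_expm(W * W) - d  # W * W is the elementwise square


def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def sample_network(context, encoder_weights, archetypes):
    """Mix K archetypal adjacency matrices using context-dependent weights.

    The linear encoder followed by softmax is a hypothetical stand-in for
    the learned mixing function.
    """
    mix = softmax(encoder_weights @ context)       # shape (K,)
    return np.tensordot(mix, archetypes, axes=1)   # shape (d, d)


# Toy setup: two archetypes on three variables, each acyclic on its own.
archetypes = np.zeros((2, 3, 3))
archetypes[0, 0, 1] = 1.0   # archetype 0: edge 0 -> 1
archetypes[0, 1, 2] = 0.5   # archetype 0: edge 1 -> 2
archetypes[1, 1, 0] = 1.0   # archetype 1: edge 1 -> 0 (opposite direction)

encoder_weights = np.array([[4.0, 0.0],
                            [0.0, 4.0]])  # hypothetical encoder parameters

context = np.array([1.0, 1.0])            # a context that weights both archetypes
W_mixed = sample_network(context, encoder_weights, archetypes)

penalty_archetype = acyclicity_penalty(archetypes[0])
penalty_mixed = acyclicity_penalty(W_mixed)
```

In this toy case each archetype is a DAG, yet their mixture contains both 0 -> 1 and 1 -> 0, so the penalty on the mixed network is positive. That is why the penalty is applied to the sample-specific network: in a trainable version, its gradient would flow back into the mixing function and the archetypes.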