DS, Apr 26
Connectivity-Preserving Important Separators: A Framework for Cut-Uncut Problems
Batya Kenig
Graph separation is a central tool in parameterized algorithm design, and important separators are among its most successful ingredients. They yield small, structured families of separators that can be enumerated efficiently, and underlie fixed-parameter algorithms for many problems. However, this framework fundamentally breaks down in cut-uncut settings, where one must separate terminal sets while preserving connectivity inside specified groups of terminals. In such problems, the classical reachability-based notion of importance no longer captures the separators that matter. We introduce connectivity-preserving important separators, a new framework for cut problems with connectivity constraints. Our main result shows that this family is highly structured: the number of connectivity-preserving important separators of size at most $k$ is $2^{O(k \log k)}$, and they can be enumerated within the same bound up to polynomial factors. As an application, we obtain improved fixed-parameter algorithms for Node Multiway Cut-Uncut. In particular, when the number of equivalence classes is constant, which includes 2-Sets Cut-Uncut, our approach yields a $2^{O(k \log k)}$ running time, improving on the previous $2^{O(k^2 \log k)}$ dependence. More broadly, our results show that separator-based methods can be extended from pure disconnection problems to problems that simultaneously require separation and preservation of connectivity.
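To make the cut-uncut requirement concrete, the following is a minimal sketch of a feasibility check only, not of the enumeration algorithm from the paper. It assumes an undirected graph given as an adjacency dictionary, terminal groups given as non-empty sets, and that terminals themselves may not be deleted. The names `components_after_removal` and `is_cut_uncut_solution` are hypothetical.

```python
from collections import deque


def components_after_removal(adj, removed):
    """Label every surviving vertex with a component id using BFS."""
    label = {}
    next_id = 0
    for start in adj:
        if start in removed or start in label:
            continue
        label[start] = next_id
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in removed and v not in label:
                    label[v] = next_id
                    queue.append(v)
        next_id += 1
    return label


def is_cut_uncut_solution(adj, classes, removed):
    """Check that deleting `removed` keeps each terminal class connected
    and places different classes in different components.

    Assumption (not from the source): terminals may not be deleted.
    """
    removed = set(removed)
    label = components_after_removal(adj, removed)
    used_components = set()
    for group in classes:
        if any(t in removed for t in group):
            return False  # a terminal was deleted
        ids = {label[t] for t in group}
        if len(ids) != 1:
            return False  # the class is split (or empty)
        component = ids.pop()
        if component in used_components:
            return False  # two classes share a component
        used_components.add(component)
    return True


# Path a1 - a2 - m - b1: deleting m separates {a1, a2} from {b1}.
path = {"a1": {"a2"}, "a2": {"a1", "m"}, "m": {"a2", "b1"}, "b1": {"m"}}
# Star centered at m: deleting m separates b1 but also splits {a1, a2}.
star = {"a1": {"m"}, "a2": {"m"}, "b1": {"m"}, "m": {"a1", "a2", "b1"}}
groups = [{"a1", "a2"}, {"b1"}]
```

On the path, `is_cut_uncut_solution(path, groups, {"m"})` accepts, while on the star the same deletion is rejected because it disconnects `a1` from `a2`. This is the situation in which a separator that only cuts is not enough.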
AI, Oct 21, 2023
Approximate Implication for Probabilistic Graphical Models
Batya Kenig
The graphical structure of Probabilistic Graphical Models (PGMs) represents the conditional independence (CI) relations that hold in the modeled distribution. Every separator in the graph represents a conditional independence relation in the distribution, making separators the vehicle through which new conditional independencies are inferred and verified. The notion of separation in graphs depends on whether the graph is directed (i.e., a Bayesian Network) or undirected (i.e., a Markov Network). The premise of all current systems of inference for deriving CIs in PGMs is that the set of CIs used to construct the PGM holds exactly. In practice, algorithms for extracting the structure of PGMs from data discover approximate CIs that do not hold exactly in the distribution. In this paper, we ask how the error in this set propagates to the inferred CIs read off the graphical structure. More precisely, what guarantee can we provide on the inferred CI when the set of CIs that entailed it holds only approximately? It has recently been shown that in the general case, no such guarantee can be provided. In this work, we prove new negative and positive results concerning this problem. We prove that separators in undirected PGMs do not necessarily represent approximate CIs. That is, no guarantee can be provided for CIs inferred from the structure of undirected graphs. We prove that such a guarantee exists for the set of CIs inferred in directed graphical models, making the $d$-separation algorithm a sound and complete system for inferring approximate CIs. We also establish improved approximation guarantees for independence relations derived from marginal and saturated CIs.
LG, Apr 21, 2025
Causal DAG Summarization (Full Version)
Anna Zeng, Michael Cafarella, Batya Kenig et al.
Causal inference aids researchers in discovering cause-and-effect relationships, leading to scientific insights. Accurate causal estimation requires identifying confounding variables to avoid false discoveries. Pearl's causal model uses causal DAGs to identify confounding variables, but incorrect DAGs can lead to unreliable causal conclusions. However, for high-dimensional data, the causal DAGs are often complex beyond human verifiability. Graph summarization is a logical next step, but current methods for general-purpose graph summarization are inadequate for causal DAG summarization. This paper addresses these challenges by proposing a causal graph summarization objective that balances simplifying the graph for better understanding against retaining the essential causal information needed for reliable inference. We develop an efficient greedy algorithm and show that summary causal DAGs can be directly used for inference and are more robust to misspecification of assumptions. In experiments on six real-life datasets, we compare our algorithm to three existing solutions, showing its effectiveness in handling high-dimensional data and its ability to generate summary DAGs that ensure both reliable causal inference and robustness against misspecifications.
AI, May 30, 2021
Approximate Implication with d-Separation
Batya Kenig
The graphical structure of Probabilistic Graphical Models (PGMs) encodes the conditional independence (CI) relations that hold in the modeled distribution. Graph algorithms, such as d-separation, use this structure to infer additional conditional independencies, and to query whether a specific CI holds in the distribution. The premise of all current systems of inference for deriving CIs in PGMs is that the set of CIs used to construct the PGM holds exactly. In practice, algorithms for extracting the structure of PGMs from data discover approximate CIs that do not hold exactly in the distribution. In this paper, we ask how the error in this set propagates to the inferred CIs read off the graphical structure. More precisely, what guarantee can we provide on the inferred CI when the set of CIs that entailed it holds only approximately? It has recently been shown that in the general case, no such guarantee can be provided. We prove that such a guarantee exists for the set of CIs inferred in directed graphical models, making the d-separation algorithm a sound and complete system for inferring approximate CIs. We also prove an approximation guarantee for independence relations derived from marginal CIs.
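For readers unfamiliar with d-separation, the following is a minimal sketch of the classical exact check, using the standard moralized ancestral graph criterion. It does not implement the approximation guarantees discussed in the abstract. It assumes the DAG is given as a dictionary mapping each node to a list of its parents; the names `ancestors_of` and `d_separated` are hypothetical.

```python
from collections import deque


def ancestors_of(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    result = set(nodes)
    stack = list(nodes)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in result:
                result.add(p)
                stack.append(p)
    return result


def d_separated(parents, xs, ys, zs):
    """Test whether xs and ys are d-separated given zs.

    Steps: restrict to the ancestral set of xs, ys and zs, moralize
    (connect co-parents, drop edge directions), delete zs, and check
    that no undirected path joins xs to ys.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors_of(parents, xs | ys | zs)
    adj = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))  # all parents are in `keep`
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q)  # "marry" co-parents of the same child
                adj[q].add(p)
    frontier = deque(x for x in xs if x not in zs)
    seen = set(frontier)
    while frontier:
        u = frontier.popleft()
        if u in ys:
            return False
        for v in adj[u]:
            if v not in zs and v not in seen:
                seen.add(v)
                frontier.append(v)
    return True


collider = {"C": ["A", "B"]}           # A -> C <- B
chain = {"B": ["A"], "C": ["B"]}       # A -> B -> C
```

In the collider, `A` and `B` are d-separated by the empty set but become dependent once `C` is conditioned on; in the chain, conditioning on `B` blocks the path from `A` to `C`.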