AIJul 3, 2020
On Symbolically Encoding the Behavior of Random ForestsArthur Choi, Andy Shih, Anchal Goyanka et al.
Recent work has shown that the input-output behavior of some machine learning systems can be captured symbolically using Boolean expressions or tractable Boolean circuits, which facilitates reasoning about the behavior of these systems. While most of the focus has been on systems with Boolean inputs and outputs, we address systems with discrete inputs and outputs, including ones with discretized continuous variables as in systems based on decision trees. We also focus on the suitability of encodings for computing prime implicants, which have recently played a central role in explaining the decisions of machine learning systems. We show some key distinctions with encodings for satisfiability, and propose an encoding that is sound and complete for the given task.
AIJun 12, 2020
A New Perspective on Learning Context-Specific IndependenceYujia Shen, Arthur Choi, Adnan Darwiche
Local structure such as context-specific independence (CSI) has received much attention in the probabilistic graphical model (PGM) literature, as it facilitates the modeling of large complex systems, as well as for reasoning with them. In this paper, we provide a new perspective on how to learn CSIs from data. We propose to first learn a functional and parameterized representation of a conditional probability table (CPT), such as a neural network. Next, we quantize this continuous function, into an arithmetic circuit representation that facilitates efficient inference. In the first step, we can leverage the many powerful tools that have been developed in the machine learning literature. In the second step, we exploit more recently-developed analytic tools from explainable AI, for the purposes of learning CSIs. Finally, we contrast our approach, empirically and conceptually, with more traditional variable-splitting approaches, that search for CSIs more explicitly.
LGApr 5, 2020
On Tractable Representations of Binary Neural NetworksWeijia Shi, Andy Shih, Adnan Darwiche et al.
We consider the compilation of a binary neural network's decision function into tractable representations such as Ordered Binary Decision Diagrams (OBDDs) and Sentential Decision Diagrams (SDDs). Obtaining this function as an OBDD/SDD facilitates the explanation and formal verification of a neural network's behavior. First, we consider the task of verifying the robustness of a neural network, and show how we can compute the expected robustness of a neural network, given an OBDD/SDD representation of it. Next, we consider a more efficient approach for compiling neural networks, based on a pseudo-polynomial time algorithm for compiling a neuron. We then provide a case study in a handwritten digits dataset, highlighting how two neural networks trained from the same dataset can have very high accuracies, yet have very different levels of robustness. Finally, in experiments, we show that it is feasible to obtain compact representations of neural networks as SDDs.
AIDec 21, 2018
On the Relative Expressiveness of Bayesian and Neural NetworksArthur Choi, Ruocheng Wang, Adnan Darwiche
A neural network computes a function. A central property of neural networks is that they are "universal approximators:" for a given continuous function, there exists a neural network that can approximate it arbitrarily well, given enough neurons (and some additional assumptions). In contrast, a Bayesian network is a model, but each of its queries can be viewed as computing a function. In this paper, we identify some key distinctions between the functions computed by neural networks and those by marginal Bayesian network queries, showing that the former are more expressive than the latter. Moreover, we propose a simple augmentation to Bayesian networks (a testing operator), which enables their marginal queries to become "universal approximators."
AIMay 9, 2018
A Symbolic Approach to Explaining Bayesian Network ClassifiersAndy Shih, Arthur Choi, Adnan Darwiche
We propose an approach for explaining Bayesian network classifiers, which is based on compiling such classifiers into decision functions that have a tractable and symbolic form. We introduce two types of explanations for why a classifier may have classified an instance positively or negatively and suggest algorithms for computing these explanations. The first type of explanation identifies a minimal set of the currently active features that is responsible for the current classification, while the second type of explanation identifies a minimal set of features whose current state (active or not) is sufficient for the classification. We consider in particular the compilation of Naive and Latent-Tree Bayesian network classifiers into Ordered Decision Diagrams (ODDs), providing a context for evaluating our proposal using case studies and experiments based on classifiers from the literature.
AIAug 22, 2017
On Relaxing Determinism in Arithmetic CircuitsArthur Choi, Adnan Darwiche
The past decade has seen a significant interest in learning tractable probabilistic representations. Arithmetic circuits (ACs) were among the first proposed tractable representations, with some subsequent representations being instances of ACs with weaker or stronger properties. In this paper, we provide a formal basis under which variants on ACs can be compared, and where the precise roles and semantics of their various properties can be made more transparent. This allows us to place some recent developments on ACs in a clearer perspective and to also derive new results for ACs. This includes an exponential separation between ACs with and without determinism; completeness and incompleteness results; and tractability results (or lack thereof) when computing most probable explanations (MPEs).
AIApr 5, 2015
Dual Decomposition from the Perspective of Relax, Compensate and then RecoverArthur Choi, Adnan Darwiche
Relax, Compensate and then Recover (RCR) is a paradigm for approximate inference in probabilistic graphical models that has previously provided theoretical and practical insights on iterative belief propagation and some of its generalizations. In this paper, we characterize the technique of dual decomposition in the terms of RCR, viewing it as a specific way to compensate for relaxed equivalence constraints. Among other insights gathered from this perspective, we propose novel heuristics for recovering relaxed equivalence constraints with the goal of incrementally tightening dual decomposition approximations, all the way to reaching exact solutions. We also show empirically that recovering equivalence constraints can sometimes tighten the corresponding approximation (and obtaining exact results), without increasing much the complexity of inference.
LGNov 25, 2014
Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete DataGuy Van den Broeck, Karthika Mohan, Arthur Choi et al.
We propose an efficient family of algorithms to learn the parameters of a Bayesian network from incomplete data. In contrast to textbook approaches such as EM and the gradient method, our approach is non-iterative, yields closed form parameter estimates, and eliminates the need for inference in a Bayesian network. Our approach provides consistent parameter estimates for missing data problems that are MCAR, MAR, and in some cases, MNAR. Empirically, our approach is orders of magnitude faster than EM (as our approach requires no inference). Given sufficient data, we learn parameters that can be orders of magnitude more accurate.
AIOct 16, 2012
New Advances and Theoretical Insights into EDMLKhaled S. Refaat, Arthur Choi, Adnan Darwiche
EDML is a recently proposed algorithm for learning MAP parameters in Bayesian networks. In this paper, we present a number of new advances and insights on the EDML algorithm. First, we provide the multivalued extension of EDML, originally proposed for Bayesian networks over binary variables. Next, we identify a simplified characterization of EDML that further implies a simple fixed-point algorithm for the convex optimization problem that underlies it. This characterization further reveals a connection between EDML and EM: a fixed point of EDML is a fixed point of EM, and vice versa. We thus identify also a new characterization of EM fixed points, but in the semantics of EDML. Finally, we propose a hybrid EDML/EM algorithm that takes advantage of the improved empirical convergence behavior of EDML, while maintaining the monotonic improvement property of EM.
AIOct 16, 2012
Lifted Relax, Compensate and then Recover: From Approximate to Exact Lifted Probabilistic InferenceGuy Van den Broeck, Arthur Choi, Adnan Darwiche
We propose an approach to lifted approximate inference for first-order probabilistic models, such as Markov logic networks. It is based on performing exact lifted inference in a simplified first-order model, which is found by relaxing first-order constraints, and then compensating for the relaxation. These simplified models can be incrementally improved by carefully recovering constraints that have been relaxed, also at the first-order level. This leads to a spectrum of approximations, with lifted belief propagation on one end, and exact lifted inference on the other. We discuss how relaxation, compensation, and recovery can be performed, all at the firstorder level, and show empirically that our approach substantially improves on the approximations of both propositional solvers and lifted belief propagation.
AIJul 4, 2012
On Bayesian Network Approximation by Edge DeletionArthur Choi, Hei Chan, Adnan Darwiche
We consider the problem of deleting edges from a Bayesian network for the purpose of simplifying models in probabilistic inference. In particular, we propose a new method for deleting network edges, which is based on the evidence at hand. We provide some interesting bounds on the KL-divergence between original and approximate networks, which highlight the impact of given evidence on the quality of approximation and shed some light on good and bad candidates for edge deletion. We finally demonstrate empirically the promise of the proposed edge deletion technique as a basis for approximate inference.
AIJun 27, 2012
A Variational Approach for Approximating Bayesian Networks by Edge DeletionArthur Choi, Adnan Darwiche
We consider in this paper the formulation of approximate inference in Bayesian networks as a problem of exact inference on an approximate network that results from deleting edges (to reduce treewidth). We have shown in earlier work that deleting edges calls for introducing auxiliary network parameters to compensate for lost dependencies, and proposed intuitive conditions for determining these parameters. We have also shown that our method corresponds to IBP when enough edges are deleted to yield a polytree, and corresponds to some generalizations of IBP when fewer edges are deleted. In this paper, we propose a different criteria for determining auxiliary parameters based on optimizing the KL-divergence between the original and approximate networks. We discuss the relationship between the two methods for selecting parameters, shedding new light on IBP and its generalizations. We also discuss the application of our new method to approximating inference problems which are exponential in constrained treewidth, including MAP and nonmyopic value of information.
AIJun 20, 2012
Node Splitting: A Scheme for Generating Upper Bounds in Bayesian NetworksArthur Choi, Mark Chavira, Adnan Darwiche
We formulate in this paper the mini-bucket algorithm for approximate inference in terms of exact inference on an approximate model produced by splitting nodes in a Bayesian network. The new formulation leads to a number of theoretical and practical implications. First, we show that branchand- bound search algorithms that use minibucket bounds may operate in a drastically reduced search space. Second, we show that the proposed formulation inspires new minibucket heuristics and allows us to analyze existing heuristics from a new perspective. Finally, we show that this new formulation allows mini-bucket approximations to benefit from recent advances in exact inference, allowing one to significantly increase the reach of these approximations.
LGJun 13, 2012
Approximating the Partition Function by Deleting and then Correcting for Model EdgesArthur Choi, Adnan Darwiche
We propose an approach for approximating the partition function which is based on two steps: (1) computing the partition function of a simplified model which is obtained by deleting model edges, and (2) rectifying the result by applying an edge-by-edge correction. The approach leads to an intuitive framework in which one can trade-off the quality of an approximation with the complexity of computing it. It also includes the Bethe free energy approximation as a degenerate case. We develop the approach theoretically in this paper and provide a number of empirical results that reveal its practical utility.
AIFeb 14, 2012
EDML: A Method for Learning Parameters in Bayesian NetworksArthur Choi, Khaled S. Refaat, Adnan Darwiche
We propose a method called EDML for learning MAP parameters in binary Bayesian networks under incomplete data. The method assumes Beta priors and can be used to learn maximum likelihood parameters when the priors are uninformative. EDML exhibits interesting behaviors, especially when compared to EM. We introduce EDML, explain its origin, and study some of its properties both analytically and empirically.