Moises Goldszmidt

14papers

2,259citations

Novelty43%

AI Score27

Ranked #163,068 of 205,806 authors (top 79%)#10,958 in AI (top 77%)

14 Papers

LGNov 17, 2022

Active Learning with Expected Error Reduction

Stephen Mussmann, Julia Reisler, Daniel Tsai et al. · apple-ml

Active learning has been studied extensively as a method for efficient data collection. Among the many approaches in literature, Expected Error Reduction (EER) (Roy and McCallum) has been shown to be an effective method for active learning: select the candidate sample that, in expectation, maximally decreases the error on an unlabeled set. However, EER requires the model to be retrained for every candidate sample and thus has not been widely used for modern deep neural networks due to this large computational cost. In this paper we reformulate EER under the lens of Bayesian active learning and derive a computationally efficient version that can use any Bayesian parameter sampling method (such as arXiv:1506.02142). We then compare the empirical performance of our method using Monte Carlo dropout for parameter sampling against state of the art methods in the deep active learning literature. Experiments are performed on four standard benchmark datasets and three WILDS datasets (arXiv:2012.07421). The results indicate that our method outperforms all other methods except one in the data shift scenario: a model dependent, non-information theoretic method that requires an order of magnitude higher computational cost (arXiv:1906.03671).

AIApr 13, 2013

Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (2000)

Craig Boutilier, Moises Goldszmidt

This is the Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, which was held in San Francisco, CA, June 30 - July 3, 2000

AIMar 27, 2013

Deciding Consistency of Databases Containing Defeasible and Strict Information

Moises Goldszmidt, Judea Pearl

We propose a norm of consistency for a mixed set of defeasible and strict sentences, based on a probabilistic semantics. This norm establishes a clear distinction between knowledge bases depicting exceptions and those containing outright contradictions. We then define a notion of entailment based also on probabilistic considerations and provide a characterization of the relation between consistency and entailment. We derive necessary and sufficient conditions for consistency, and provide a simple decision procedure for testing consistency and deciding whether a sentence is entailed by a database. Finally, it is shown that if al1 sentences are Horn clauses, consistency and entailment can be tested in polynomial time.

AIMar 13, 2013

Reasoning With Qualitative Probabilities Can Be Tractable

Moises Goldszmidt, Judea Pearl

We recently described a formalism for reasoning with if-then rules that re expressed with different levels of firmness [18]. The formalism interprets these rules as extreme conditional probability statements, specifying orders of magnitude of disbelief, which impose constraints over possible rankings of worlds. It was shown that, once we compute a priority function Z+ on the rules, the degree to which a given query is confirmed or denied can be computed in O(log n`) propositional satisfiability tests, where n is the number of rules in the knowledge base. In this paper, we show that computing Z+ requires O(n2 X log n) satisfiability tests, not an exponential number as was conjectured in [18], which reduces to polynomial complexity in the case of Horn expressions. We also show how reasoning with imprecise observations can be incorporated in our formalism and how the popular notions of belief revision and epistemic entrenchment are embodied naturally and tractably.

AIFeb 27, 2013

On the Relation between Kappa Calculus and Probabilistic Reasoning

Adnan Darwiche, Moises Goldszmidt

We study the connection between kappa calculus and probabilistic reasoning in diagnosis applications. Specifically, we abstract a probabilistic belief network for diagnosing faults into a kappa network and compare the ordering of faults computed using both methods. We show that, at least for the example examined, the ordering of faults coincide as long as all the causal relations in the original probabilistic network are taken into account. We also provide a formal analysis of some network structures where the two methods will differ. Both kappa rankings and infinitesimal probabilities have been used extensively to study default reasoning and belief revision. But little has been done on utilizing their connection as outlined above. This is partly because the relation between kappa and probability calculi assumes that probabilities are arbitrarily close to one (or zero). The experiments in this paper investigate this relation when this assumption is not satisfied. The reported results have important implications on the use of kappa rankings to enhance the knowledge engineering of uncertainty models.

AIFeb 27, 2013

Action Networks: A Framework for Reasoning about Actions and Change under Uncertainty

Adnan Darwiche, Moises Goldszmidt

This work proposes action networks as a semantically well-founded framework for reasoning about actions and change under uncertainty. Action networks add two primitives to probabilistic causal networks: controllable variables and persistent variables. Controllable variables allow the representation of actions as directly setting the value of specific events in the domain, subject to preconditions. Persistent variables provide a canonical model of persistence according to which both the state of a variable and the causal mechanism dictating its value persist over time unless intervened upon by an action (or its consequences). Action networks also allow different methods for quantifying the uncertainty in causal relationships, which go beyond traditional probabilistic quantification. This paper describes both recent results and work in progress.

AIFeb 20, 2013

Fast Belief Update Using Order-of-Magnitude Probabilities

Moises Goldszmidt

We present an algorithm, called Predict, for updating beliefs in causal networks quantified with order-of-magnitude probabilities. The algorithm takes advantage of both the structure and the quantification of the network and presents a polynomial asymptotic complexity. Predict exhibits a conservative behavior in that it is always sound but not always complete. We provide sufficient conditions for completeness and present algorithms for testing these conditions and for computing a complete set of plausible values. We propose Predict as an efficient method to estimate probabilistic values and illustrate its use in conjunction with two known algorithms for probabilistic inference. Finally, we describe an application of Predict to plan evaluation, present experimental results, and discuss issues regarding its use with conditional logics of belief, and in the characterization of irrelevance.

AIFeb 13, 2013

Learning Bayesian Networks with Local Structure

Nir Friedman, Moises Goldszmidt

In this paper we examine a novel addition to the known methods for learning Bayesian networks from data that improves the quality of the learned networks. Our approach explicitly represents and learns the local structure in the conditional probability tables (CPTs), that quantify these networks. This increases the space of possible models, enabling the representation of CPTs with a variable number of parameters that depends on the learned local structures. The resulting learning procedure is capable of inducing models that better emulate the real complexity of the interactions present in the data. We describe the theoretical foundations and practical aspects of learning local structures, as well as an empirical evaluation of the proposed method. This evaluation indicates that learning curves characterizing the procedure that exploits the local structure converge faster than these of the standard procedure. Our results also show that networks learned with local structure tend to be more complex (in terms of arcs), yet require less parameters.

AIFeb 13, 2013

Context-Specific Independence in Bayesian Networks

Craig Boutilier, Nir Friedman, Moises Goldszmidt et al.

Bayesian networks provide a language for qualitatively representing the conditional independence properties of a distribution. This allows a natural and compact representation of the distribution, eases knowledge acquisition, and supports effective inference algorithms. It is well-known, however, that there are certain independencies that we cannot capture qualitatively within the Bayesian network structure: independencies that hold only in certain contexts, i.e., given a specific assignment of values to certain variables. In this paper, we propose a formal notion of context-specific independence (CSI), based on regularities in the conditional probability tables (CPTs) at a node. We present a technique, analogous to (and based on) d-separation, for determining when such independence holds in a given network. We then focus on a particular qualitative representation scheme - tree-structured CPTs - for capturing CSI. We suggest ways in which this representation can be used to support effective inference algorithms. In particular, we present a structural decomposition of the resulting network which can improve the performance of clustering algorithms, and an alternative algorithm based on cutset conditioning.

AIFeb 6, 2013

Sequential Update of Bayesian Network Structure

Nir Friedman, Moises Goldszmidt

There is an obvious need for improving the performance and accuracy of a Bayesian network as new data is observed. Because of errors in model construction and changes in the dynamics of the domains, we cannot afford to ignore the information in new data. While sequential update of parameters for a fixed structure can be accomplished using standard techniques, sequential update of network structure is still an open problem. In this paper, we investigate sequential update of Bayesian networks were both parameters and structure are expected to change. We introduce a new approach that allows for the flexible manipulation of the tradeoff between the quality of the learned networks and the amount of information that is maintained about past observations. We formally describe our approach including the necessary modifications to the scoring functions for learning Bayesian networks, evaluate its effectiveness through an empirical study, and extend it to the case of missing data.

LGJan 23, 2013

Data Analysis with Bayesian Networks: A Bootstrap Approach

Nir Friedman, Moises Goldszmidt, Abraham Wyner

In recent years there has been significant progress in algorithms and methods for inducing Bayesian networks from data. However, in complex data analysis problems, we need to go beyond being satisfied with inducing networks with high scores. We need to provide confidence measures on features of these networks: Is the existence of an edge between two nodes warranted? Is the Markov blanket of a given node robust? Can we say something about the ordering of the variables? We should be able to address these questions, even when the amount of data is not enough to induce a high scoring network. In this paper we propose Efron's Bootstrap as a computationally efficient approach for answering these questions. In addition, we propose to use these confidence measures to induce better structures from the data, and to detect the presence of latent variables.

AIJan 23, 2013

Continuous Value Function Approximation for Sequential Bidding Policies

Craig Boutilier, Moises Goldszmidt, Bikash Sabata

Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and mulitagent decision problems. When agents value resources in combination rather than in isolation, they must often deliberate about appropriate bidding strategies for a sequence of auctions offering resources of interest. We briefly describe a discrete dynamic programming model for constructing appropriate bidding policies for resources exhibiting both complementarities and substitutability. We then introduce a continuous approximation of this model, assuming that money (or the numeraire good) is infinitely divisible. Though this has the potential to reduce the computational cost of computing policies, value functions in the transformed problem do not have a convenient closed form representation. We develop {em grid-based} approximation for such value functions, representing value functions using piecewise linear approximations. We show that these methods can offer significant computational savings with relatively small cost in solution quality.

SEJun 20, 2012

Making life better one large system at a time: Challenges for UAI research

Moises Goldszmidt

The rapid growth and diversity in service offerings and the ensuing complexity of information technology ecosystems present numerous management challenges (both operational and strategic). Instrumentation and measurement technology is, by and large, keeping pace with this development and growth. However, the algorithms, tools, and technology required to transform the data into relevant information for decision making are not. The claim in this paper (and the invited talk) is that the line of research conducted in Uncertainty in Artificial Intelligence is very well suited to address the challenges and close this gap. I will support this claim and discuss open problems using recent examples in diagnosis, model discovery, and policy optimization on three real life distributed systems.

AIJun 13, 2012

CT-NOR: Representing and Reasoning About Events in Continuous Time

Aleksandr Simma, Moises Goldszmidt, John MacCormick et al.

We present a generative model for representing and reasoning about the relationships among events in continuous time. We apply the model to the domain of networked and distributed computing environments where we fit the parameters of the model from timestamp observations, and then use hypothesis testing to discover dependencies between the events and changes in behavior for monitoring and diagnosis. After introducing the model, we present an EM algorithm for fitting the parameters and then present the hypothesis testing approach for both dependence discovery and change-point detection. We validate the approach for both tasks using real data from a trace of network events at Microsoft Research Cambridge. Finally, we formalize the relationship between the proposed model and the noisy-or gate for cases when time can be discretized.