CVMar 16Code
A Skill-augmented Agentic Framework and Benchmark for Multi-Video UnderstandingYue Zhang, Liqiang Jing, Jia Li et al.
Multimodal Large Language Models have achieved strong performance in single-video understanding, yet their ability to reason across multiple videos remains limited. Existing approaches typically concatenate multiple videos into a single input and perform direct inference, which introduces training-inference mismatch, information loss from frame compression, and a lack of explicit cross-video coordination. Meanwhile, current multi-video benchmarks primarily emphasize event-level comparison, leaving identity-level matching, fine-grained discrimination, and structured multi-step reasoning underexplored. To address these gaps, we introduce MVX-Bench, a Multi-Video Cross-Dimension Benchmark that reformulates 11 classical computer vision tasks into a unified multi-video question-answering framework, comprising 1,442 questions over 4,255 videos from diverse real-world datasets. We further propose SAMA, a Skill-Augmented Agentic Framework for Multi-Video Understanding, which integrates visual tools, task-specific skills, and a conflict-aware verification mechanism to enable iterative and structured reasoning. Experimental results show that SAMA outperforms strong open-source baselines and GPT on MVX-Bench, and ablations validate the effectiveness of skill design and conflict resolution.
CVMar 13Code
Towards Spatio-Temporal World Scene Graph Generation from Monocular VideosRohith Peddi, Saurabh, Shravan Shanmugam et al.
Spatio-temporal scene graphs provide a principled representation for modeling evolving object interactions, yet existing methods remain fundamentally frame-centric: they reason only about currently visible objects, discard entities upon occlusion, and operate in 2D. To address this, we first introduce ActionGenome4D, a dataset that upgrades Action Genome videos into 4D scenes via feed-forward 3D reconstruction, world-frame oriented bounding boxes for every object involved in actions, and dense relationship annotations including for objects that are temporarily unobserved due to occlusion or camera motion. Building on this data, we formalize World Scene Graph Generation (WSGG), the task of constructing a world scene graph at each timestamp that encompasses all interacting objects in the scene, both observed and unobserved. We then propose three complementary methods, each exploring a different inductive bias for reasoning about unobserved objects: PWG (Persistent World Graph), which implements object permanence via a zero-order feature buffer; MWAE (Masked World Auto-Encoder), which reframes unobserved-object reasoning as masked completion with cross-view associative retrieval; and 4DST (4D Scene Transformer), which replaces the static buffer with differentiable per-object temporal attention enriched by 3D motion and camera-pose features. We further design and evaluate the performance of strong open-source Vision-Language Models on the WSGG task via a suite of Graph RAG-based approaches, establishing baselines for unlocalized relationship prediction. WSGG thus advances video scene understanding toward world-centric, temporally persistent, and interpretable scene reasoning.
LGFeb 1, 2023
Deep Dependency Networks for Multi-Label ClassificationShivvrat Arya, Yu Xiang, Vibhav Gogate
We propose a simple approach which combines the strengths of probabilistic graphical models and deep learning architectures for solving the multi-label classification task, focusing specifically on image and video data. First, we show that the performance of previous approaches that combine Markov Random Fields with neural networks can be modestly improved by leveraging more powerful methods such as iterative join graph propagation, integer linear programming, and $\ell_1$ regularization-based structure learning. Then we propose a new modeling framework called deep dependency networks, which augments a dependency network, a model that is easy to train and learns more accurate dependencies but is limited to Gibbs sampling for inference, to the output layer of a neural network. We show that despite its simplicity, jointly learning this new architecture yields significant improvements in performance over the baseline neural network. In particular, our experimental evaluation on three video activity classification datasets: Charades, Textually Annotated Cooking Scenes (TACoS), and Wetlab, and three multi-label image classification datasets: MS-COCO, PASCAL VOC, and NUS-WIDE show that deep dependency networks are almost always superior to pure neural architectures that do not use dependency networks.
CVDec 22, 2023
CaptainCook4D: A Dataset for Understanding Errors in Procedural ActivitiesRohith Peddi, Shivvrat Arya, Bharath Challa et al.
Following step-by-step procedures is an essential component of various activities carried out by individuals in their daily lives. These procedures serve as a guiding framework that helps to achieve goals efficiently, whether it is assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understanding such procedural activities from a sequence of frames is a challenging task that demands an accurate interpretation of visual information and the ability to reason about the structure of the activity. To this end, we collect a new egocentric 4D dataset, CaptainCook4D, comprising 384 recordings (94.5 hours) of people performing recipes in real kitchen environments. This dataset consists of two distinct types of activity: one in which participants adhere to the provided recipe instructions and another in which they deviate and induce errors. We provide 5.3K step annotations and 10K fine-grained action annotations and benchmark the dataset for the following tasks: supervised error recognition, multistep localization, and procedure learning
CVDec 19, 2024
Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven OptimizationYue Zhang, Liqiang Jing, Vibhav Gogate
We introduce a new task called Defeasible Visual Entailment (DVE), where the goal is to allow the modification of the entailment relationship between an image premise and a text hypothesis based on an additional update. While this concept is well-established in Natural Language Inference, it remains unexplored in visual entailment. At a high level, DVE enables models to refine their initial interpretations, leading to improved accuracy and reliability in various applications such as detecting misleading information in images, enhancing visual question answering, and refining decision-making processes in autonomous systems. Existing metrics do not adequately capture the change in the entailment relationship brought by updates. To address this, we propose a novel inference-aware evaluator designed to capture changes in entailment strength induced by updates, using pairwise contrastive learning and categorical information learning. Additionally, we introduce a reward-driven update optimization method to further enhance the quality of updates generated by multimodal models. Experimental results demonstrate the effectiveness of our proposed evaluator and optimization method.
CVMar 7, 2024
Towards Scene Graph AnticipationRohith Peddi, Saksham Singh, Saurabh et al.
Spatio-temporal scene graphs represent interactions in a video by decomposing scenes into individual objects and their pair-wise temporal relationships. Long-term anticipation of the fine-grained pair-wise relationships between objects is a challenging problem. To this end, we introduce the task of Scene Graph Anticipation (SGA). We adapt state-of-the-art scene graph generation methods as baselines to anticipate future pair-wise relationships between objects and propose a novel approach SceneSayer. In SceneSayer, we leverage object-centric representations of relationships to reason about the observed video frames and model the evolution of relationships between objects. We take a continuous time perspective and model the latent dynamics of the evolution of object interactions using concepts of NeuralODE and NeuralSDE, respectively. We infer representations of future relationships by solving an Ordinary Differential Equation and a Stochastic Differential Equation, respectively. Extensive experimentation on the Action Genome dataset validates the efficacy of the proposed methods.
ROMar 8, 2024
Grasping Trajectory Optimization with Point CloudsYu Xiang, Sai Haneesh Allu, Rohith Peddi et al.
We introduce a new trajectory optimization method for robotic grasping based on a point-cloud representation of robots and task spaces. In our method, robots are represented by 3D points on their link surfaces. The task space of a robot is represented by a point cloud that can be obtained from depth sensors. Using the point-cloud representation, goal reaching in grasping can be formulated as point matching, while collision avoidance can be efficiently achieved by querying the signed distance values of the robot points in the signed distance field of the scene points. Consequently, a constrained nonlinear optimization problem is formulated to solve the joint motion and grasp planning problem. The advantage of our method is that the point-cloud representation is general to be used with any robot in any environment. We demonstrate the effectiveness of our method by performing experiments on a tabletop scene and a shelf scene for grasping with a Fetch mobile manipulator and a Franka Panda arm. The project page is available at \url{https://irvlutd.github.io/GraspTrajOpt}
CVNov 20, 2024
Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and AnticipationRohith Peddi, Saurabh, Ayush Abhay Shrivastava et al.
Spatio-Temporal Scene Graphs (STSGs) provide a concise and expressive representation of dynamic scenes by modeling objects and their evolving relationships over time. However, real-world visual relationships often exhibit a long-tailed distribution, causing existing methods for tasks like Video Scene Graph Generation (VidSGG) and Scene Graph Anticipation (SGA) to produce biased scene graphs. To this end, we propose ImparTail, a novel training framework that leverages loss masking and curriculum learning to mitigate bias in the generation and anticipation of spatio-temporal scene graphs. Unlike prior methods that add extra architectural components to learn unbiased estimators, we propose an impartial training objective that reduces the dominance of head classes during learning and focuses on underrepresented tail relationships. Our curriculum-driven mask generation strategy further empowers the model to adaptively adjust its bias mitigation strategy over time, enabling more balanced and robust estimations. To thoroughly assess performance under various distribution shifts, we also introduce two new tasks Robust Spatio-Temporal Scene Graph Generation and Robust Scene Graph Anticipation offering a challenging benchmark for evaluating the resilience of STSG models. Extensive experiments on the Action Genome dataset demonstrate the superior unbiased performance and robustness of our method compared to existing baselines.
LGApr 17, 2024
Learning to Solve the Constrained Most Probable Explanation Task in Probabilistic Graphical ModelsShivvrat Arya, Tahrima Rahman, Vibhav Gogate
We propose a self-supervised learning approach for solving the following constrained optimization task in log-linear models or Markov networks. Let $f$ and $g$ be two log-linear models defined over the sets $\mathbf{X}$ and $\mathbf{Y}$ of random variables respectively. Given an assignment $\mathbf{x}$ to all variables in $\mathbf{X}$ (evidence) and a real number $q$, the constrained most-probable explanation (CMPE) task seeks to find an assignment $\mathbf{y}$ to all variables in $\mathbf{Y}$ such that $f(\mathbf{x}, \mathbf{y})$ is maximized and $g(\mathbf{x}, \mathbf{y})\leq q$. In our proposed self-supervised approach, given assignments $\mathbf{x}$ to $\mathbf{X}$ (data), we train a deep neural network that learns to output near-optimal solutions to the CMPE problem without requiring access to any pre-computed solutions. The key idea in our approach is to use first principles and approximate inference methods for CMPE to derive novel loss functions that seek to push infeasible solutions towards feasible ones and feasible solutions towards optimal ones. We analyze the properties of our proposed method and experimentally demonstrate its efficacy on several benchmark problems.
LGFeb 6, 2024
Neural Network Approximators for Marginal MAP in Probabilistic CircuitsShivvrat Arya, Tahrima Rahman, Vibhav Gogate
Probabilistic circuits (PCs) such as sum-product networks efficiently represent large multi-variate probability distributions. They are preferred in practice over other probabilistic representations such as Bayesian and Markov networks because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the maximum-a-posteriori (MAP) and marginal MAP (MMAP) tasks remain NP-hard in these models. Inspired by the recent work on using neural networks for generating near-optimal solutions to optimization problems such as integer linear programming, we propose an approach that uses neural networks to approximate (M)MAP inference in PCs. The key idea in our approach is to approximate the cost of an assignment to the query variables using a continuous multilinear function, and then use the latter as a loss function. The two main benefits of our new method are that it is self-supervised and after the neural network is learned, it requires only linear time to output a solution. We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear time approximations, max-product inference, max-marginal inference and sequential estimation, which are used in practice to solve MMAP tasks in PCs.
LGApr 17, 2024
Deep Dependency Networks and Advanced Inference Schemes for Multi-Label ClassificationShivvrat Arya, Yu Xiang, Vibhav Gogate
We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with deep learning architectures, they provide an intuitive, easy-to-use loss function for multi-label classification. A drawback of DDNs compared to Markov networks is their lack of advanced inference schemes, necessitating the use of Gibbs sampling. To address this challenge, we propose novel inference schemes based on local search and integer linear programming for computing the most likely assignment to the labels given observations. We evaluate our novel methods on three video datasets (Charades, TACoS, Wetlab) and three image datasets (MS-COCO, PASCAL VOC, NUS-WIDE), comparing their performance with (a) basic neural architectures and (b) neural architectures combined with Markov networks equipped with advanced inference and learning techniques. Our results demonstrate the superiority of our new DDN methods over the two competing approaches.
CVJun 27, 2025
Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video EntailmentYue Zhang, Jilei Sun, Yunhui Guo et al.
Video Large Multimodal Models (VLMMs) have made impressive strides in understanding video content, but they often struggle with abstract and adaptive reasoning-the ability to revise their interpretations when new information emerges. In reality, conclusions are rarely set in stone; additional context can strengthen or weaken an initial inference. To address this, we introduce Defeasible Video Entailment (DVidE), a new task that challenges models to think like doubters, constantly updating their reasoning based on evolving evidence. In DVidE, given a video premise and a textual hypothesis, models must determine whether a new update strengthens or weakens the hypothesis (classification version) or generate a coherent update that modifies the entailment relationship (generation version). For solving the classification task, we propose the Chain of Counterfactual Thought framework, utilizing counterfactual reasoning, ASR-enhanced video content, and rationale refinement to reduce inference bias. For the generation task, we develop a framework that combines ASR output with a Large Language Model (LLM) to produce coherent, contextually relevant updates aligned with the intended strengthener or weakener goals. Additionally, we introduce a novel benchmark dataset, with strengthener/weakener annotations and an LLM-based evaluation metric specifically designed for assessing generative performance. Experimental results demonstrate significant improvements, highlighting our proposed method in enhancing dynamic reasoning capabilities of VLMMs.
HCMay 5, 2020
Don't Explain without Verifying Veracity: An Evaluation of Explainable AI with Video Activity RecognitionMahsan Nourani, Chiradeep Roy, Tahrima Rahman et al.
Explainable machine learning and artificial intelligence models have been used to justify a model's decision-making process. This added transparency aims to help improve user performance and understanding of the underlying model. However, in practice, explainable systems face many open questions and challenges. Specifically, designers might reduce the complexity of deep learning models in order to provide interpretability. The explanations generated by these simplified models, however, might not accurately justify and be truthful to the model. This can further add confusion to the users as they might not find the explanations meaningful with respect to the model predictions. Understanding how these explanations affect user behavior is an ongoing challenge. In this paper, we explore how explanation veracity affects user performance and agreement in intelligent systems. Through a controlled user study with an explainable activity recognition system, we compare variations in explanation veracity for a video review and querying task. The results suggest that low veracity explanations significantly decrease user performance and agreement compared to both accurate explanations and a system without explanations. These findings demonstrate the importance of accurate and understandable explanations and caution that poor explanations can sometimes be worse than no explanations with respect to their effect on user performance and reliance on an AI system.
LGJul 3, 2018
Domain Aware Markov Logic NetworksHappy Mittal, Ayush Bhardwaj, Vibhav Gogate et al.
Combining logic and probability has been a long stand- ing goal of AI research. Markov Logic Networks (MLNs) achieve this by attaching weights to formulas in first-order logic, and can be seen as templates for constructing features for ground Markov networks. Most techniques for learning weights of MLNs are domain-size agnostic, i.e., the size of the domain is not explicitly taken into account while learn- ing the parameters of the model. This often results in ex- treme probabilities when testing on domain sizes different from those seen during training. In this paper, we propose Domain Aware Markov logic Networks (DA-MLNs) which present a principled solution to this problem. While defin- ing the ground network distribution, DA-MLNs divide the ground feature weight by a scaling factor which is a function of the number of connections the ground atoms appearing in the feature are involved in. We show that standard MLNs fall out as a special case of our formalism when this func- tion evaluates to a constant equal to 1. Experiments on the benchmark Friends & Smokers domain show that our ap- proach results in significantly higher accuracies compared to existing methods when testing on domains whose sizes different from those seen during training.
AIJul 2, 2018
Lifted Marginal MAP InferenceVishal Sharma, Noman Ahmed Sheikh, Happy Mittal et al.
Lifted inference reduces the complexity of inference in relational probabilistic models by identifying groups of constants (or atoms) which behave symmetric to each other. A number of techniques have been proposed in the literature for lifting marginal as well MAP inference. We present the first application of lifting rules for marginal-MAP (MMAP), an important inference problem in models having latent (random) variables. Our main contribution is two fold: (1) we define a new equivalence class of (logical) variables, called Single Occurrence for MAX (SOM), and show that solution lies at extreme with respect to the SOM variables, i.e., predicate groundings differing only in the instantiation of the SOM variables take the same truth value (2) we define a sub-class {\em SOM-R} (SOM Reduce) and exploit properties of extreme assignments to show that MMAP inference can be performed by reducing the domain of SOM-R variables to a single constant.We refer to our lifting technique as the {\em SOM-R} rule for lifted MMAP. Combined with existing rules such as decomposer and binomial, this results in a powerful framework for lifted MMAP. Experiments on three benchmark domains show significant gains in both time and memory compared to ground inference as well as lifted approaches not using SOM-R.
MLJun 14, 2018
Scalable Neural Network Compression and Pruning Using Hard Clustering and L1 RegularizationYibo Yang, Nicholas Ruozzi, Vibhav Gogate
We propose a simple and easy to implement neural network compression algorithm that achieves results competitive with more complicated state-of-the-art methods. The key idea is to modify the original optimization problem by adding K independent Gaussian priors (corresponding to the k-means objective) over the network parameters to achieve parameter quantization, as well as an L1 penalty to achieve pruning. Unlike many existing quantization-based methods, our method uses hard clustering assignments of network parameters, which adds minimal change or overhead to standard network training. We also demonstrate experimentally that tying neural network parameters provides less gain in generalization performance than changing network architecture and connectivity patterns entirely.
AIJun 30, 2016
Lifted Region-Based Belief PropagationDavid Smith, Parag Singla, Vibhav Gogate
Due to the intractable nature of exact lifted inference, research has recently focused on the discovery of accurate and efficient approximate inference algorithms in Statistical Relational Models (SRMs), such as Lifted First-Order Belief Propagation. FOBP simulates propositional factor graph belief propagation without constructing the ground factor graph by identifying and lifting over redundant message computations. In this work, we propose a generalization of FOBP called Lifted Generalized Belief Propagation, in which both the region structure and the message structure can be lifted. This approach allows more of the inference to be performed intra-region (in the exact inference step of BP), thereby allowing simulation of propagation on a graph structure with larger region scopes and fewer edges, while still maintaining tractability. We demonstrate that the resulting algorithm converges in fewer iterations to more accurate results on a variety of SRMs.
AIMay 26, 2016
Probabilistic Inference Modulo TheoriesRodrigo de Salvo Braz, Ciaran O'Reilly, Vibhav Gogate et al.
We present SGDPLL(T), an algorithm that solves (among many other problems) probabilistic inference modulo theories, that is, inference problems over probabilistic models defined via a logic theory provided as a parameter (currently, propositional, equalities on discrete sorts, and inequalities, more specifically difference arithmetic, on bounded integers). While many solutions to probabilistic inference over logic representations have been proposed, SGDPLL(T) is simultaneously (1) lifted, (2) exact and (3) modulo theories, that is, parameterized by a background logic theory. This offers a foundation for extending it to rich logic languages such as data structures and relational data. By lifted, we mean algorithms with constant complexity in the domain size (the number of values that variables can take). We also detail a solver for summations with difference arithmetic and show experimental results from a scenario in which SGDPLL(T) is much faster than a state-of-the-art probabilistic solver.
AIJan 15, 2014
Join-Graph Propagation AlgorithmsRobert Mateescu, Kalev Kask, Vibhav Gogate et al.
The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl's belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), that combines both iteration and bounded inference. Algorithm IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that allowed connections with approximate algorithms from statistical physics and is shown empirically to surpass the performance of mini-clustering and belief propagation, as well as a number of other state-of-the-art algorithms on several classes of networks. We also provide insight into the accuracy of iterative BP and IJGP by relating these algorithms to well known classes of constraint propagation schemes.
AISep 26, 2013
Dynamic Blocking and Collapsing for Gibbs SamplingDeepak Venugopal, Vibhav Gogate
In this paper, we investigate combining blocking and collapsing -- two widely used strategies for improving the accuracy of Gibbs sampling -- in the context of probabilistic graphical models (PGMs). We show that combining them is not straight-forward because collapsing (or eliminating variables) introduces new dependencies in the PGM and in computation-limited settings, this may adversely affect blocking. We therefore propose a principled approach for tackling this problem. Specifically, we develop two scoring functions, one each for blocking and collapsing, and formulate the problem of partitioning the variables in the PGM into blocked and collapsed subsets as simultaneously maximizing both scoring functions (i.e., a multi-objective optimization problem). We propose a dynamic, greedy algorithm for approximately solving this intractable optimization problem. Our dynamic algorithm periodically updates the partitioning into blocked and collapsed variables by leveraging correlation statistics gathered from the generated samples and enables rapid mixing by blocking together and collapsing highly correlated variables. We demonstrate experimentally the clear benefit of our dynamic approach: as more samples are drawn, our dynamic approach significantly outperforms static graph-based approaches by an order of magnitude in terms of accuracy.
AISep 26, 2013
Structured Message PassingVibhav Gogate, Pedro Domingos
In this paper, we present structured message passing (SMP), a unifying framework for approximate inference algorithms that take advantage of structured representations such as algebraic decision diagrams and sparse hash tables. These representations can yield significant time and space savings over the conventional tabular representation when the message has several identical values (context-specific independence) or zeros (determinism) or both in its range. Therefore, in order to fully exploit the power of structured representations, we propose to artificially introduce context-specific independence and determinism in the messages. This yields a new class of powerful approximate inference algorithms which includes popular algorithms such as cluster-graph Belief propagation (BP), expectation propagation and particle BP as special cases. We show that our new algorithms introduce several interesting bias-variance trade-offs. We evaluate these trade-offs empirically and demonstrate that our new algorithms are more accurate and scalable than state-of-the-art techniques.
DSJul 11, 2012
A Complete Anytime Algorithm for TreewidthVibhav Gogate, Rina Dechter
In this paper, we present a Branch and Bound algorithm called QuickBB for computing the treewidth of an undirected graph. This algorithm performs a search in the space of perfect elimination ordering of vertices of the graph. The algorithm uses novel pruning and propagation techniques which are derived from the theory of graph minors and graph isomorphism. We present a new algorithm called minor-min-width for computing a lower bound on treewidth that is used within the branch and bound algorithm and which improves over earlier available lower bounds. Empirical evaluation of QuickBB on randomly generated graphs and benchmarks in Graph Coloring and Bayesian Networks shows that it is consistently better than complete algorithms like QuickTree [Shoikhet and Geiger, 1997] in terms of cpu time. QuickBB also has good anytime performance, being able to generate a better upper bound on treewidth of some graphs whose optimal treewidth could not be computed up to now.
AIJul 4, 2012
Approximate Inference Algorithms for Hybrid Bayesian Networks with Discrete ConstraintsVibhav Gogate, Rina Dechter
In this paper, we consider Hybrid Mixed Networks (HMN) which are Hybrid Bayesian Networks that allow discrete deterministic information to be modeled explicitly in the form of constraints. We present two approximate inference algorithms for HMNs that integrate and adjust well known algorithmic principles such as Generalized Belief Propagation, Rao-Blackwellised Importance Sampling and Constraint Propagation to address the complexity of modeling and reasoning in HMNs. We demonstrate the performance of our approximate inference algorithms on randomly generated HMNs.
AIJul 4, 2012
Modeling Transportation Routines using Hybrid Dynamic Mixed NetworksVibhav Gogate, Rina Dechter, Bozhena Bidyuk et al.
This paper describes a general framework called Hybrid Dynamic Mixed Networks (HDMNs) which are Hybrid Dynamic Bayesian Networks that allow representation of discrete deterministic information in the form of constraints. We propose approximate inference algorithms that integrate and adjust well known algorithmic principles such as Generalized Belief Propagation, Rao-Blackwellised Particle Filtering and Constraint Propagation to address the complexity of modeling and reasoning in HDMNs. We use this framework to model a person's travel activity over time and to predict destination and routes given the current location. We present a preliminary empirical evaluation demonstrating the effectiveness of our modeling framework and algorithms using several variants of the activity model.
AIJun 20, 2012
Studies in Lower Bounding Probabilities of Evidence using the Markov InequalityVibhav Gogate, Bozhena Bidyuk, Rina Dechter
Computing the probability of evidence even with known error bounds is NP-hard. In this paper we address this hard problem by settling on an easier problem. We propose an approximation which provides high confidence lower bounds on probability of evidence but does not have any guarantees in terms of relative or absolute error. Our proposed approximation is a randomized importance sampling scheme that uses the Markov inequality. However, a straight-forward application of the Markov inequality may lead to poor lower bounds. We therefore propose several heuristic measures to improve its performance in practice. Empirical evaluation of our scheme with state-of- the-art lower bounding schemes reveals the promise of our approach.
AIJun 13, 2012
AND/OR Importance SamplingVibhav Gogate, Rina Dechter
The paper introduces AND/OR importance sampling for probabilistic graphical models. In contrast to importance sampling, AND/OR importance sampling caches samples in the AND/OR space and then extracts a new sample mean from the stored samples. We prove that AND/OR importance sampling may have lower variance than importance sampling; thereby providing a theoretical justification for preferring it over importance sampling. Our empirical evaluation demonstrates that AND/OR importance sampling is far more accurate than importance sampling in many cases.
AIMar 15, 2012
Formula-Based Probabilistic InferenceVibhav Gogate, Pedro Domingos
Computing the probability of a formula given the probabilities or weights associated with other formulas is a natural extension of logical inference to the probabilistic setting. Surprisingly, this problem has received little attention in the literature to date, particularly considering that it includes many standard inference problems as special cases. In this paper, we propose two algorithms for this problem: formula decomposition and conditioning, which is an exact method, and formula importance sampling, which is an approximate method. The latter is, to our knowledge, the first application of model counting to approximate probabilistic inference. Unlike conventional variable-based algorithms, our algorithms work in the dual realm of logical formulas. Theoretically, we show that our algorithms can greatly improve efficiency by exploiting the structural information in the formulas. Empirically, we show that they are indeed quite powerful, often achieving substantial performance gains over state-of-the-art schemes.
AIFeb 14, 2012
Probabilistic Theorem ProvingVibhav Gogate, Pedro Domingos
Many representation schemes combining first-order logic and probability have been proposed in recent years. Progress in unifying logical and probabilistic inference has been slower. Existing methods are mainly variants of lifted variable elimination and belief propagation, neither of which take logical structure into account. We propose the first method that has the full power of both graphical model inference and first-order theorem proving (in finite domains with Herbrand interpretations). We first define probabilistic theorem proving, their generalization, as the problem of computing the probability of a logical formula given the probabilities or weights of a set of formulas. We then show how this can be reduced to the problem of lifted weighted model counting, and develop an efficient algorithm for the latter. We prove the correctness of this algorithm, investigate its properties, and show how it generalizes previous approaches. Experiments show that it greatly outperforms lifted variable elimination when logical structure is present. Finally, we propose an algorithm for approximate probabilistic theorem proving, and show that it can greatly outperform lifted belief propagation.
AIFeb 14, 2012
Approximation by QuantizationVibhav Gogate, Pedro Domingos
Inference in graphical models consists of repeatedly multiplying and summing out potentials. It is generally intractable because the derived potentials obtained in this way can be exponentially large. Approximate inference techniques such as belief propagation and variational methods combat this by simplifying the derived potentials, typically by dropping variables from them. We propose an alternate method for simplifying potentials: quantizing their values. Quantization causes different states of a potential to have the same value, and therefore introduces context-specific independencies that can be exploited to represent the potential more compactly. We use algebraic decision diagrams (ADDs) to do this efficiently. We apply quantization and ADD reduction to variable elimination and junction tree propagation, yielding a family of bounded approximate inference schemes. Our experimental tests show that our new schemes significantly outperform state-of-the-art approaches on many benchmark instances.