Sriraam Natarajan

h-index 23

45 papers

359 citations

Novelty 48%

AI Score 55

Ranked #24,400 of 201,326 authors (top 12%) · #5,595 in LG (top 13%)

45 Papers

HC · Jul 19, 2022

Human-guided Collaborative Problem Solving: A Natural Language based Framework

Harsha Kokel, Mayukh Das, Rakibul Islam et al. · ibm-research

We consider the problem of human-machine collaborative problem solving as a planning task coupled with natural language communication. Our framework consists of three components -- a natural language engine that parses the language utterances to a formal representation and vice-versa, a concept learner that induces generalized concepts for plans based on limited interactions with the user, and an HTN planner that solves the task based on human interaction. We illustrate the ability of this framework to address the key challenges of collaborative problem solving by demonstrating it on a collaborative building task in a Minecraft-based blocksworld domain. The accompanying demo video is available at https://youtu.be/q1pWe4aahF0.

LG · Jun 16, 2022

Explainable Models via Compression of Tree Ensembles

Siwen Yan, Sriraam Natarajan, Saket Joshi et al.

Ensemble models (bagging and gradient-boosting) of relational decision trees have proved to be one of the most effective learning methods in the area of probabilistic logic models (PLMs). While effective, they lose one of the most important aspects of PLMs -- interpretability. In this paper we consider the problem of compressing a large set of learned trees into a single explainable model. To this effect, we propose CoTE -- Compression of Tree Ensembles -- that produces a single small decision list as a compressed representation. CoTE first converts the trees to decision lists and then performs the combination and compression with the aid of the original training set. An experimental evaluation demonstrates the effectiveness of CoTE on several benchmark relational data sets.
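
The first step described in this abstract, converting a tree to a decision list, can be illustrated with a minimal sketch. It assumes a propositional binary tree encoded as nested tuples, which is a simplification: CoTE itself works on relational trees and then combines and compresses the lists using the training set, none of which is shown here. The function name `tree_to_decision_list` and the toy tree are hypothetical.

```python
from typing import List, Tuple, Union

# Assumed encoding (not from the paper): a leaf is a float,
# an internal node is (test_name, true_branch, false_branch).
Tree = Union[float, Tuple[str, "Tree", "Tree"]]
Rule = Tuple[List[str], float]


def tree_to_decision_list(tree: Tree, path: Tuple[str, ...] = ()) -> List[Rule]:
    """Flatten a binary tree into an ordered list of (conditions, value) rules.

    Each root-to-leaf path becomes one rule; rules are emitted left to right,
    so the true branch of every test comes before its false branch.
    """
    if not isinstance(tree, tuple):
        # Leaf: the accumulated path is the rule body.
        return [(list(path), float(tree))]
    test, if_true, if_false = tree
    return (
        tree_to_decision_list(if_true, path + (test,))
        + tree_to_decision_list(if_false, path + (f"not {test}",))
    )


# Hypothetical toy tree, for illustration only.
toy_tree: Tree = ("smokes", ("has_friend_who_smokes", 0.9, 0.6), 0.1)
rules = tree_to_decision_list(toy_tree)
for conditions, value in rules:
    print(" AND ".join(conditions), "=>", value)
```

Every leaf of the tree yields exactly one rule, so the list has as many entries as the tree has leaves.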

LG · Sep 10, 2023

Knowledge-based Refinement of Scientific Publication Knowledge Graphs

Siwen Yan, Phillip Odom, Sriraam Natarajan

We consider the problem of identifying authorship by posing it as knowledge graph construction and refinement. To this effect, we model this problem as learning a probabilistic logic model in the presence of human guidance (knowledge-based learning). Specifically, we learn relational regression trees using functional gradient boosting that outputs explainable rules. To incorporate human knowledge, advice in the form of first-order clauses is injected to refine the trees. We demonstrate the usefulness of human knowledge both quantitatively and qualitatively in seven authorship domains.

AI · Feb 7, 2023

MACOptions: Multi-Agent Learning with Centralized Controller and Options Framework

Alakh Aggarwal, Rishita Bansal, Parth Padalkar et al.

Automation is now applied in a wide range of settings, and in each of them planning the actions that agents take is an important concern. In this paper, we aim to implement planning for multiple agents with a centralized controller. We compare three approaches: a random policy, Q-learning, and Q-learning with the Options Framework. We also show the effectiveness of planners by comparing the performance of Q-learning with and without a planner.
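
For readers unfamiliar with the Q-learning baseline named in this abstract, the standard tabular update can be sketched as follows. This is the textbook single-agent rule, not the paper's multi-agent or options-based setup; the state and action names and the values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen (state, action) pairs default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
first = q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
second = q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
print(first, second)
```

With all values initialized to zero, repeated updates on the same transition move the estimate halfway toward the target each time, because `alpha` is 0.5.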

AI · Sep 18, 2023

Promoting Research Collaboration with Open Data Driven Team Recommendation in Response to Call for Proposals

Siva Likitha Valluru, Biplav Srivastava, Sai Teja Paladi et al.

Building teams and promoting collaboration are two very common business activities. An example of these is seen in the TeamingForFunding problem, where research institutions and researchers are interested in identifying collaborative opportunities when applying to funding agencies in response to the latter's calls for proposals. We describe a novel system to recommend teams using a variety of AI methods, such that (1) each team achieves the highest possible skill coverage that is demanded by the opportunity, and (2) the workload of distributing the opportunities is balanced amongst the candidate members. We address these questions by extracting skills latent in open data of proposal calls (demand) and researcher profiles (supply), normalizing them using taxonomies, and creating efficient algorithms that match demand to supply. We create teams to maximize goodness along a novel metric balancing short- and long-term objectives. We validate the success of our algorithms (1) quantitatively, by evaluating the recommended teams using a goodness score and finding that more informed methods lead to recommendations of a smaller number of teams but higher goodness, and (2) qualitatively, by conducting a large-scale user study at a college-wide level and demonstrating that users overall found the tool very useful and relevant. Lastly, we evaluate our system in two diverse settings in the US and India (of researchers and proposal calls) to establish the generality of our approach, and deploy it at a major US university for routine use.

34.4 · LG · Mar 12

Geometry-Aware Probabilistic Circuits via Voronoi Tessellations

Sahil Sidheekh, Sriraam Natarajan

Probabilistic circuits (PCs) enable exact and tractable inference but employ data independent mixture weights that limit their ability to capture local geometry of the data manifold. We propose Voronoi tessellations (VT) as a natural way to incorporate geometric structure directly into the sum nodes of a PC. However, naïvely introducing such structure breaks tractability. We formalize this incompatibility and develop two complementary solutions: (1) an approximate inference framework that provides guaranteed lower and upper bounds for inference, and (2) a structural condition for VT under which exact tractable inference is recovered. Finally, we introduce a differentiable relaxation for VT that enables gradient-based learning and empirically validate the resulting approach on standard density estimation tasks.

23.2 · LG · May 15

Imitation learning for clinical decision support in pediatric ECMO

Fateme Golivand, Michael Skinner, Saurabh Mathur et al.

Pediatric critical care is a dynamic, high-stakes process involving constant monitoring and adjustments in life-saving treatments. Modeling these interventions is crucial for effective decision support. To address the challenges of high complexity and data scarcity in pediatric Extracorporeal Membrane Oxygenation (ECMO), we frame clinical decision-making as learning to act from trajectories, i.e., imitation learning that learns action models from observational data, with a key feature that actions are not directly observed. We consider TabPFN, a recent transformer-based approach for tabular data, and traditional baselines including XGBoost and Multi-Layer Perceptrons (MLPs) on real-world pediatric ECMO data to learn the action models. We find that the TabPFN-based approach consistently outperforms these classical baselines, supporting its use as a strong clinician-behavior baseline for pediatric ECMO decision support.

36.8 · LG · Mar 27

Context-specific Credibility-aware Multimodal Fusion with Conditional Probabilistic Circuits

Pranuthi Tenali, Sahil Sidheekh, Saurabh Mathur et al.

Multimodal fusion requires integrating information from multiple sources that may conflict depending on context. Existing fusion approaches typically rely on static assumptions about source reliability, limiting their ability to resolve conflicts when a modality becomes unreliable due to situational factors such as sensor degradation or class-specific corruption. We introduce C$^2$MF, a context-specific credibility-aware multimodal fusion framework that models per-instance source reliability using a Conditional Probabilistic Circuit (CPC). We formalize instance-level reliability through Context-Specific Information Credibility (CSIC), a KL-divergence-based measure computed exactly from the CPC. CSIC generalizes conventional static credibility estimates as a special case, enabling principled and adaptive reliability assessment. To evaluate robustness under cross-modal conflicts, we propose the Conflict benchmark, in which class-specific corruptions deliberately induce discrepancies between different modalities. Experimental results show that C$^2$MF improves predictive accuracy by up to 29% over static-reliability baselines in high-noise settings, while preserving the interpretability advantages of probabilistic circuit-based fusion.

AI · Sep 2, 2025 · Code

mFARM: Towards Multi-Faceted Fairness Assessment based on HARMs in Clinical Decision Support

Shreyash Adappanavar, Krithi Shailya, Gokul S Krishnan et al.

The deployment of Large Language Models (LLMs) in high-stakes medical settings poses a critical AI alignment challenge, as models can inherit and amplify societal biases, leading to significant disparities. Existing fairness evaluation methods fall short in these contexts as they typically use simplistic metrics that overlook the multi-dimensional nature of medical harms. This also promotes models that are fair only because they are clinically inert, defaulting to safe but potentially inaccurate outputs. To address this gap, our contributions are mainly two-fold: first, we construct two large-scale, controlled benchmarks (ED-Triage and Opioid Analgesic Recommendation) from MIMIC-IV, comprising over 50,000 prompts with twelve race x gender variants and three context tiers. Second, we propose a multi-metric framework - Multi-faceted Fairness Assessment based on hARMs ($mFARM$) to audit fairness for three distinct dimensions of disparity (Allocational, Stability, and Latent) and aggregate them into an $mFARM$ score. We also present an aggregated Fairness-Accuracy Balance (FAB) score to benchmark and observe trade-offs between fairness and prediction accuracy. We empirically evaluate four open-source LLMs (Mistral-7B, BioMistral-7B, Qwen-2.5-7B, Bio-LLaMA3-8B) and their finetuned versions under quantization and context variations. Our findings showcase that the proposed $mFARM$ metrics capture subtle biases more effectively under various settings. We find that most models maintain robust performance in terms of $mFARM$ score across varying levels of quantization but deteriorate significantly when the context is reduced. Our benchmarks and evaluation code are publicly released to enhance research in aligned AI for healthcare.

15.7 · LG · May 8

Neurosymbolic Imitation Learning with Human Guidance: A Privileged Information Approach

Nikhilesh Prabhakar, Varun Balaji, Athresh Karanam et al.

Imitation learning is widely used for learning to act in complex environments. While pure neural-based methods handle high-dimensional data effectively, they require a large number of samples and are prone to overfitting. Pure symbolic approaches, while they generalize well, do not handle high-dimensional data effectively. We propose a neurosymbolic approach that achieves the best of both worlds, i.e., handling high-dimensional data while achieving generalization. The key advantage of our approach is that it can effectively exploit additional privileged information that is available only during training (in our case, gaze data). Our empirical evaluations demonstrate the effectiveness, efficiency and the generalization capability of our proposed approach.

LG · Feb 1, 2024

Building Expressive and Tractable Probabilistic Generative Models: A Review

Sahil Sidheekh, Sriraam Natarajan

We present a comprehensive survey of the advancements and techniques in the field of tractable probabilistic generative modeling, primarily focusing on Probabilistic Circuits (PCs). We provide a unified perspective on the inherent trade-offs between expressivity and tractability, highlighting the design principles and algorithmic extensions that have enabled building expressive and efficient PCs, and provide a taxonomy of the field. We also discuss recent efforts to build deep and hybrid PCs by fusing notions from deep neural models, and outline the challenges and open questions that can guide future research in this evolving field.

LG · Mar 5, 2024

Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits

Sahil Sidheekh, Pranuthi Tenali, Saurabh Mathur et al.

We consider the problem of late multi-modal fusion for discriminative learning. Motivated by noisy, multi-source domains that require understanding the reliability of each data source, we explore the notion of credibility in the context of multi-modal fusion. We propose a combination function that uses probabilistic circuits (PCs) to combine predictive distributions over individual modalities. We also define a probabilistic measure to evaluate the credibility of each modality via inference queries over the PC. Our experimental evaluation demonstrates that our fusion method can reliably infer credibility while maintaining competitive performance with the state-of-the-art.

LG · May 3, 2024

A Unified Framework for Human-Allied Learning of Probabilistic Circuits

Athresh Karanam, Saurabh Mathur, Sahil Sidheekh et al.

Probabilistic Circuits (PCs) have emerged as an efficient framework for representing and learning complex probability distributions. Nevertheless, the existing body of research on PCs predominantly concentrates on data-driven parameter learning, often neglecting the potential of knowledge-intensive learning, a particular issue in data-scarce/knowledge-rich domains such as healthcare. To bridge this gap, we propose a novel unified framework that can systematically integrate diverse domain knowledge into the parameter learning process of PCs. Experiments on several benchmarks as well as real world datasets show that our proposed framework can both effectively and efficiently leverage domain knowledge to achieve superior performance compared to purely data-driven learning approaches.

LG · Mar 5

Recursive Inference Machines for Neural Reasoning

Mieszko Komisarczyk, Saurabh Mathur, Maurice Kraus et al.

Neural reasoners such as Tiny Recursive Models (TRMs) solve complex problems by combining neural backbones with specialized inference schemes. Such inference schemes have been a central component of stochastic reasoning systems, where inference rules are applied to a stochastic model to derive answers to complex queries. In this work, we bridge these two paradigms by introducing Recursive Inference Machines (RIMs), a neural reasoning framework that explicitly incorporates recursive inference mechanisms inspired by classical inference engines. We show that TRMs can be expressed as an instance of RIMs, allowing us to extend them through a reweighting component, yielding better performance on challenging reasoning benchmarks, including ARC-AGI-1, ARC-AGI-2, and Sudoku Extreme. Furthermore, we show that RIMs can be used to improve reasoning on other tasks, such as the classification of tabular data, outperforming TabPFNs.

LG · Oct 17, 2025

Human-Allied Relational Reinforcement Learning

Fateme Golivand Darvishvand, Hikaru Shindo, Sahil Sidheekh et al.

Reinforcement learning (RL) has experienced a second wind in the past decade. While incredibly successful in images and videos, these systems still operate within the realm of propositional tasks, ignoring the inherent structure that exists in the problem. Consequently, relational extensions (RRL) have been developed for such structured problems that allow for effective generalization to an arbitrary number of objects. However, they inherently make strong assumptions about the problem structure. We introduce a novel framework that combines RRL with object-centric representation to handle both structured and unstructured data. We enhance learning by allowing the system to actively query the human expert for guidance by explicitly modeling the uncertainty over the policy. Our empirical evaluation demonstrates the effectiveness and efficiency of our proposed approach.

CL · Oct 3, 2025

IndiCASA: A Dataset and Bias Evaluation Framework in LLMs Using Contrastive Embedding Similarity in the Indian Context

Santhosh G S, Akshay Govind S, Gokul S Krishnan et al.

Large Language Models (LLMs) have gained significant traction across critical domains owing to their impressive contextual understanding and generative capabilities. However, their increasing deployment in high-stakes applications necessitates rigorous evaluation of embedded biases, particularly in culturally diverse contexts like India where existing embedding-based bias assessment methods often fall short in capturing nuanced stereotypes. We propose an evaluation framework based on an encoder trained using contrastive learning that captures fine-grained bias through embedding similarity. We also introduce a novel dataset - IndiCASA (IndiBias-based Contextually Aligned Stereotypes and Anti-stereotypes) comprising 2,575 human-validated sentences spanning five demographic axes: caste, gender, religion, disability, and socioeconomic status. Our evaluation of multiple open-weight LLMs reveals that all models exhibit some degree of stereotypical bias, with disability-related biases being notably persistent and religion bias generally lower, likely due to global debiasing efforts, demonstrating the need for fairer model development.

LG · Aug 7, 2025

Tractable Sharpness-Aware Learning of Probabilistic Circuits

Hrithik Suresh, Sahil Sidheekh, Vishnu Shreeram M. P et al.

Probabilistic Circuits (PCs) are a class of generative models that allow exact and tractable inference for a wide range of queries. While recent developments have enabled the learning of deep and expressive PCs, this increased capacity can often lead to overfitting, especially when data is limited. We analyze PC overfitting from a log-likelihood-landscape perspective and show that it is often caused by convergence to sharp optima that generalize poorly. Inspired by sharpness-aware minimization in neural networks, we propose a Hessian-based regularizer for training PCs. As a key contribution, we show that the trace of the Hessian of the log-likelihood, a sharpness proxy that is typically intractable in deep neural networks, can be computed efficiently for PCs. Minimizing this Hessian trace induces a gradient-norm-based regularizer that yields simple closed-form parameter updates for EM, and integrates seamlessly with gradient-based learning methods. Experiments on synthetic and real-world datasets demonstrate that our method consistently guides PCs toward flatter minima and improves generalization performance.

LG · Jul 6, 2025

Tractable Representation Learning with Probabilistic Circuits

Steven Braun, Sahil Sidheekh, Antonio Vergari et al.

Probabilistic circuits (PCs) are powerful probabilistic models that enable exact and tractable inference, making them highly suitable for probabilistic reasoning and inference tasks. While dominant in neural networks, representation learning with PCs remains underexplored, with prior approaches relying on external neural embeddings or activation-based encodings. To address this gap, we introduce autoencoding probabilistic circuits (APCs), a novel framework leveraging the tractability of PCs to model probabilistic embeddings explicitly. APCs extend PCs by jointly modeling data and embeddings, obtaining embedding representations through tractable probabilistic inference. The PC encoder allows the framework to natively handle arbitrary missing data and is seamlessly integrated with a neural decoder in a hybrid, end-to-end trainable architecture enabled by differentiable sampling. Our empirical evaluation demonstrates that APCs outperform existing PC-based autoencoding methods in reconstruction quality, generate embeddings competitive with, and exhibit superior robustness in handling missing data compared to neural autoencoders. These results highlight APCs as a powerful and flexible representation learning method that exploits the probabilistic inference capabilities of PCs, showing promising directions for robust inference, out-of-distribution detection, and knowledge distillation.

MA · Feb 26, 2025

Combining Planning and Reinforcement Learning for Solving Relational Multiagent Domains

Nikhilesh Prabhakar, Ranveer Singh, Harsha Kokel et al. · ibm-research

Multiagent Reinforcement Learning (MARL) poses significant challenges due to the exponential growth of state and action spaces and the non-stationary nature of multiagent environments. This results in notable sample inefficiency and hinders generalization across diverse tasks. The complexity is further pronounced in relational settings, where domain knowledge is crucial but often underutilized by existing MARL algorithms. To overcome these hurdles, we propose integrating relational planners as centralized controllers with efficient state abstractions and reinforcement learning. This approach proves to be sample-efficient and facilitates effective task transfer and generalization.

LG · Oct 19, 2021

Explaining Deep Tractable Probabilistic Models: The sum-product network case

Athresh Karanam, Saurabh Mathur, Predrag Radivojac et al.

We consider the problem of explaining a class of tractable deep probabilistic models, the Sum-Product Networks (SPNs), and present an algorithm, ExSPN, to generate explanations. To this effect, we define the notion of a context-specific independence tree (CSI-tree) and present an iterative algorithm that converts an SPN to a CSI-tree. The resulting CSI-tree is both interpretable and explainable to the domain expert. We achieve this by extracting the conditional independencies encoded by the SPN and approximating the local context specified by the structure of the SPN. Our extensive empirical evaluations on synthetic, standard, and real-world clinical data sets demonstrate that the CSI-tree exhibits superior explainability.

LG · Oct 18, 2021

Relational Neural Markov Random Fields

Yuqiao Chen, Sriraam Natarajan, Nicholas Ruozzi

Statistical Relational Learning (SRL) models have attracted significant attention due to their ability to model complex data while handling uncertainty. However, most of these models have been limited to discrete domains due to their limited potential functions. We introduce Relational Neural Markov Random Fields (RN-MRFs) which allow for handling of complex relational hybrid domains. The key advantage of our model is that it makes minimal data distributional assumptions and can seamlessly allow for human knowledge through potentials or relational rules. We propose a maximum pseudolikelihood estimation-based learning algorithm with importance sampling for training the neural potential parameters. Our empirical evaluations across diverse domains such as image processing and relational object mapping, clearly demonstrate its effectiveness against non-neural counterparts.

AI · Oct 15, 2021

Dynamic probabilistic logic models for effective abstractions in RL

Harsha Kokel, Arjun Manoharan, Sriraam Natarajan et al.

State abstraction enables sample-efficient learning and better task transfer in complex reinforcement learning environments. Recently, we proposed RePReL (Kokel et al. 2021), a hierarchical framework that leverages a relational planner to provide useful state abstractions for learning. We present a brief overview of this framework and the use of a dynamic probabilistic logic model to design these state abstractions. Our experiments show that RePReL not only achieves better performance and efficient learning on the task at hand but also demonstrates better generalization to unseen tasks.

LG · Mar 19, 2021

Predicting Drug-Drug Interactions from Heterogeneous Data: An Embedding Approach

Devendra Singh Dhami, Siwen Yan, Gautam Kunapuli et al.

Predicting and discovering drug-drug interactions (DDIs) using machine learning has been studied extensively. However, most of the approaches have focused on text data or textual representation of the drug structures. We present the first work that uses multiple data sources such as drug structure images, drug structure string representation and relational representation of drug relationships as the input. To this effect, we exploit the recent advances in deep networks to integrate these varied sources of inputs in predicting DDIs. Our empirical evaluation against several state-of-the-art methods that each use a single data type for drugs clearly demonstrates the efficacy of combining heterogeneous data in predicting DDIs.

LG · Feb 20, 2021

Interventional Sum-Product Networks: Causal Inference with Tractable Probabilistic Models

Matej Zečević, Devendra Singh Dhami, Athresh Karanam et al.

While probabilistic models are an important tool for studying causality, doing so suffers from the intractability of inference. As a step towards tractable causal models, we consider the problem of learning interventional distributions using sum-product networks (SPNs) that are over-parameterized by gate functions, e.g., neural networks. Providing an arbitrarily intervened causal graph as input, effectively subsuming Pearl's do-operator, the gate function predicts the parameters of the SPN. The resulting interventional SPNs are motivated and illustrated by a structural causal model themed around personal health. Our empirical evaluation on three benchmark data sets as well as a synthetic health data set clearly demonstrates that interventional SPNs indeed are both expressive in modelling and flexible in adapting to the interventions.

LG · Feb 13, 2021

A Statistical Relational Approach to Learning Distance-based GCNs

Devendra Singh Dhami, Siwen Yan, Sriraam Natarajan

We consider the problem of learning distance-based Graph Convolutional Networks (GCNs) for relational data. Specifically, we first embed the original graph into the Euclidean space $\mathbb{R}^m$ using a relational density estimation technique thereby constructing a secondary Euclidean graph. The graph vertices correspond to the target triples and edges denote the Euclidean distances between the target triples. We emphasize the importance of learning the secondary Euclidean graph and the advantages of employing a distance matrix over the typically used adjacency matrix. Our comprehensive empirical evaluation demonstrates the superiority of our approach over $12$ different GCN models, relational embedding techniques and rule learning techniques.
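
The contrast this abstract draws between a distance matrix and an adjacency matrix can be made concrete with a small sketch. It only shows how pairwise Euclidean distances are computed once points already live in $\mathbb{R}^m$; the relational density estimation step that produces the embedding is not shown, and the sample coordinates are made up for illustration.

```python
import math
from typing import List, Sequence


def distance_matrix(points: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


# Hypothetical embedded points in R^2 (one per target triple in the paper's setting).
embedded = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = distance_matrix(embedded)
for row in D:
    print(row)
```

Unlike a 0/1 adjacency matrix, every entry here carries a graded notion of closeness, and the diagonal is zero.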

LG · Dec 16, 2020

Relational Boosted Bandits

Ashutosh Kakadiya, Sriraam Natarajan, Balaraman Ravindran

Contextual bandit algorithms have become essential in real-world user interaction problems in recent years. However, these algorithms represent context as attribute-value pairs, which makes them infeasible for real-world domains such as social networks that are inherently relational. We propose Relational Boosted Bandits (RB2), a contextual bandits algorithm for relational domains based on (relational) boosted trees. RB2 enables us to learn interpretable and explainable models due to the more descriptive nature of the relational representation. We empirically demonstrate the effectiveness and interpretability of RB2 on tasks such as link prediction, relational classification, and recommendations.

LG · Jun 10, 2020

Fitted Q-Learning for Relational Domains

Srijita Das, Sriraam Natarajan, Kaushik Roy et al.

We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value function and Bellman residuals. When we fit the Q-functions, we show how the two steps of the Bellman operator, application and projection, can be performed using a gradient-boosting technique. Our proposed framework performs reasonably well on standard domains without using domain models and using fewer training trajectories.

AI · Mar 13, 2020

Knowledge Graph Alignment using String Edit Distance

Navdeep Kaur, Gautam Kunapuli, Sriraam Natarajan

In this work, we propose a novel knowledge graph alignment technique based upon string edit distance that exploits the type information between entities and can find similarity between relations of any arity.
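
As background for the string edit distance mentioned in this abstract, here is a minimal sketch of the classic Levenshtein distance. It is the plain character-level measure only; how the paper incorporates entity type information or handles relations of arbitrary arity is not shown, and the example strings are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions
    needed to turn string a into string b (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance("authorOf", "author_of"))
```

A smaller distance between two relation or entity names would suggest a candidate alignment; any threshold for accepting a match is a design choice outside this sketch.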

AI · Jan 13, 2020

A Preliminary Approach for Learning Relational Policies for the Management of Critically Ill Children

Michael A. Skinner, Lakshmi Raman, Neel Shah et al.

The increased use of electronic health records has made possible the automated extraction of medical policies from patient records to aid in the development of clinical decision support systems. We adapted a boosted Statistical Relational Learning (SRL) framework to learn probabilistic rules from clinical hospital records for the management of physiologic parameters of children with severe cardiac or respiratory failure who were managed with extracorporeal membrane oxygenation. In this preliminary study, the results were promising. In particular, the algorithm returned logic rules for medical actions that are consistent with medical reasoning.

LG · Jan 9, 2020

Non-Parametric Learning of Lifted Restricted Boltzmann Machines

Navdeep Kaur, Gautam Kunapuli, Sriraam Natarajan

We consider the problem of discriminatively learning restricted Boltzmann machines in the presence of relational data. Unlike previous approaches that employ a rule learner (for structure learning) and a weight learner (for parameter learning) sequentially, we develop a gradient-boosted approach that performs both simultaneously. Our approach learns a set of weak relational regression trees, whose paths from root to leaf are conjunctive clauses and represent the structure, and whose leaf values represent the parameters. When the learned relational regression trees are transformed into a lifted RBM, its hidden nodes are precisely the conjunctive clauses derived from the relational regression trees. This leads to a more interpretable and explainable model. Our empirical evaluations clearly demonstrate this aspect, while displaying no loss in effectiveness of the learned models.

LG · Jan 8, 2020

Lifted Hybrid Variational Inference

Yuqiao Chen, Yibo Yang, Sriraam Natarajan et al.

A variety of lifted inference algorithms, which exploit model symmetry to reduce computational cost, have been proposed to render inference tractable in probabilistic relational models. Most existing lifted inference algorithms operate only over discrete domains or continuous domains with restricted potential functions, e.g., Gaussian. We investigate two approximate lifted variational approaches that are applicable to hybrid domains and expressive enough to capture multi-modality. We demonstrate that the proposed variational methods are both scalable and can take advantage of approximate model symmetries, even in the presence of a large amount of continuous evidence. We demonstrate that our approach compares favorably against existing message-passing based approaches in a variety of settings. Finally, we present a sufficient condition for the Bethe approximation to yield a non-trivial estimate over the marginal polytope.

LG · Jan 2, 2020

Non-Parametric Learning of Gaifman Models

Devendra Singh Dhami, Siwen Yan, Gautam Kunapuli et al.

We consider the problem of structure learning for Gaifman models and learn relational features that can be used to derive feature representations from a knowledge base. These relational features are first-order rules that are then partially grounded and counted over local neighborhoods of a Gaifman model to obtain the feature representations. We propose a method for learning these relational features for a Gaifman model by using relational tree distances. Our empirical evaluation on real data sets demonstrates the superiority of our approach over classical rule-learning.

AI · Dec 16, 2019

User Friendly Automatic Construction of Background Knowledge: Mode Construction from ER Diagrams

Alexander L. Hayes, Mayukh Das, Phillip Odom et al.

One of the key advantages of Inductive Logic Programming systems is the ability of the domain experts to provide background knowledge as modes that allow for efficient search through the space of hypotheses. However, there is an inherent assumption that this expert should also be an ILP expert to provide effective modes. We relax this assumption by designing a graphical user interface that allows the domain expert to interact with the system using Entity Relationship diagrams. These interactions are used to construct modes for the learning system. We evaluate our algorithm on a probabilistic logic learning system where we demonstrate that the user is able to construct effective background knowledge on par with the expert-encoded knowledge on five data sets.

AIDec 15, 2019

One-Shot Induction of Generalized Logical Concepts via Human Guidance

Mayukh Das, Nandini Ramanan, Janardhan Rao Doppa et al.

We consider the problem of learning generalized first-order representations of concepts from a single example. To address this challenging problem, we augment an inductive logic programming learner with two novel algorithmic contributions. First, we define a distance measure between candidate concept representations that improves the efficiency of the search for the target concept and generalization. Second, we leverage richer human inputs in the form of advice to improve the sample-efficiency of learning. We prove that the proposed distance measure is semantically valid and use that to derive a PAC bound. Our experimental analysis on diverse concept learning tasks demonstrates both the effectiveness and efficiency of the proposed approach over a first-order concept learner using only examples.

LGNov 14, 2019

Beyond Textual Data: Predicting Drug-Drug Interactions from Molecular Structure Images using Siamese Neural Networks

Devendra Singh Dhami, Siwen Yan, Gautam Kunapuli et al.

Predicting and discovering drug-drug interactions (DDIs) is an important problem and has been studied extensively from both the medical and machine learning points of view. Almost all of the machine learning approaches have focused on text data or textual representations of the structural data of drugs. We present the first work that uses drug structure images as the input and utilizes a Siamese convolutional network architecture to predict DDIs.

LGAug 28, 2019

Neural Networks for Relational Data

Navdeep Kaur, Gautam Kunapuli, Saket Joshi et al.

While deep networks have been enormously successful over the last decade, they rely on flat-feature vector representations, which makes them unsuitable for richly structured domains such as those arising in applications like social network analysis. Such domains rely on relational representations to capture complex relationships between entities and their attributes. Thus, we consider the problem of learning neural networks for relational data. We distinguish ourselves from current approaches that rely on expert hand-coded rules by learning relational random-walk-based features to capture local structural interactions and the resulting network architecture. We further exploit parameter tying of the network weights of the resulting relational neural network, where instances of the same type share parameters. Our experimental results across several standard relational data sets demonstrate the effectiveness of the proposed approach over multiple neural net baselines as well as state-of-the-art statistical relational models.

LGMay 31, 2019

Knowledge-augmented Column Networks: Guiding Deep Learning with Advice

Mayukh Das, Devendra Singh Dhami, Yang Yu et al.

Recently, deep models have had considerable success in several tasks, especially with low-level representations. However, effective learning from sparse noisy samples is a major challenge in most deep models, especially in domains with structured representations. Inspired by the proven success of human guided machine learning, we propose Knowledge-augmented Column Networks, a relational deep learning framework that leverages human advice/knowledge to learn better models in the presence of sparsity and systematic noise.

LGApr 15, 2019

Human-Guided Learning of Column Networks: Augmenting Deep Learning with Advice

Mayukh Das, Yang Yu, Devendra Singh Dhami et al.

Recently, deep models have been successfully applied in several applications, especially with low-level representations. However, sparse, noisy samples and structured domains (with multiple objects and interactions) are some of the open challenges in most deep models. Column Networks, a deep architecture, can succinctly capture such domain structure and interactions, but may still be prone to sub-optimal learning from sparse and noisy samples. Inspired by the success of human-advice guided learning in AI, especially in data-scarce domains, we propose Knowledge-augmented Column Networks that leverage human advice/knowledge for better learning with noisy/sparse samples. Our experiments demonstrate that our approach leads to either superior overall performance or faster convergence (i.e., both effective and efficient).

LGOct 2, 2018

GLAD: GLocalized Anomaly Detection via Human-in-the-Loop Learning

Md Rakibul Islam, Shubhomoy Das, Janardhan Rao Doppa et al.

Human analysts who use anomaly detection systems in practice want to retain the use of simple and explainable global anomaly detectors. In this paper, we propose a novel human-in-the-loop learning algorithm called GLAD (GLocalized Anomaly Detection) that supports global anomaly detectors. GLAD automatically learns their local relevance to specific data instances using label feedback from human analysts. The key idea is to place a uniform prior on the relevance of each member of the anomaly detection ensemble over the input feature space via a neural network trained on unlabeled instances. Subsequently, weights of the neural network are tuned to adjust the local relevance of each ensemble member using all labeled instances. GLAD also provides explanations which can improve end-users' understanding of anomalies. Our experiments on synthetic and real-world data show the effectiveness of GLAD in learning the local relevance of ensemble members and discovering anomalies via label feedback.

LGAug 6, 2018

Structure Learning for Relational Logistic Regression: An Ensemble Approach

Nandini Ramanan, Gautam Kunapuli, Tushar Khot et al.

We consider the problem of learning Relational Logistic Regression (RLR). Unlike standard logistic regression, the features of RLRs are first-order formulae with associated weight vectors instead of scalar weights. We turn the problem of learning RLR into learning these vector-weighted formulae and develop a learning algorithm based on the recently successful functional-gradient boosting methods for probabilistic logic models. We derive the functional gradients and show how weights can be learned simultaneously in an efficient manner. Our empirical evaluation on standard and novel data sets demonstrates the superiority of our approach over other methods for learning RLR.

AIApr 19, 2018

Preference-Guided Planning: An Active Elicitation Approach

Mayukh Das, Phillip Odom, Md. Rakibul Islam et al.

Planning with preferences has been employed extensively to quickly generate high-quality plans. However, it may be difficult for the human expert to supply this information without knowledge of the reasoning employed by the planner and the distribution of planning problems. We consider the problem of actively eliciting preferences from a human expert during the planning process. Specifically, we study this problem in the context of the Hierarchical Task Network (HTN) planning framework as it allows easy interaction with the human. Our experimental results on several diverse planning domains show that the preferences gathered using the proposed approach improve the quality and speed of the planner, while reducing the burden on the human expert.

LGOct 9, 2017

Sum-Product Networks for Hybrid Domains

Alejandro Molina, Antonio Vergari, Nicola Di Mauro et al.

While all kinds of mixed data -- from personal data, through panel and scientific data, to public and commercial data -- are collected and stored, building probabilistic graphical models for these hybrid domains becomes more difficult. Users spend significant amounts of time identifying the parametric form of the random variables (Gaussian, Poisson, Logit, etc.) involved and learning the mixed models. To make this difficult task easier, we propose the first trainable probabilistic deep architecture for hybrid domains that features tractable queries. It is based on Sum-Product Networks (SPNs) with piecewise polynomial leaf distributions together with novel nonparametric decomposition and conditioning steps using the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient. This relieves the user from deciding a priori the parametric form of the random variables but is still expressive enough to effectively approximate any continuous distribution and permits efficient learning and inference. Our empirical evidence shows that the architecture, called Mixed SPNs, can indeed capture complex distributions across a wide range of hybrid domains.

AIJul 4, 2016

Application of Statistical Relational Learning to Hybrid Recommendation Systems

Shuo Yang, Mohammed Korayem, Khalifeh AlJadda et al.

Recommendation systems usually involve exploiting the relations among known features and content that describe items (content-based filtering) or the overlap of similar users who interacted with or rated the target item (collaborative filtering). To combine these two filtering approaches, current model-based hybrid recommendation systems typically require extensive feature engineering to construct a user profile. Statistical Relational Learning (SRL) provides a straightforward way to combine the two approaches. However, due to the large scale of the data used in real-world recommendation systems, little research exists on applying SRL models to hybrid recommendation systems, and essentially none of that research has been applied to real big-data-scale systems. In this paper, we propose a way to adapt state-of-the-art SRL learning approaches to construct a real hybrid recommendation system. Furthermore, in order to satisfy a common requirement in recommendation systems (i.e., that false positives are more undesirable and therefore penalized more harshly than false negatives), our approach also allows tuning the trade-off between the precision and recall of the system in a principled way. Our experimental results demonstrate the efficiency of our proposed approach as well as its improved performance on recommendation precision.

AIJul 1, 2016

Learning Relational Dependency Networks for Relation Extraction

Dileep Viswanathan, Ameet Soni, Jude Shavlik et al.

We consider the task of KBP slot filling -- extracting relation information from newswire documents for knowledge base construction. We present our pipeline, which employs Relational Dependency Networks (RDNs) to learn linguistic patterns for relation extraction. Additionally, we demonstrate how several components, such as weak supervision, word2vec features, joint learning, and the use of human advice, can be incorporated in this relational framework. We evaluate the different components on the benchmark KBP 2015 task and show that RDNs effectively model a diverse set of features and perform competitively with current state-of-the-art relation extraction.

AIMay 9, 2012

Counting Belief Propagation

Kristian Kersting, Babak Ahmadi, Sriraam Natarajan

A major benefit of graphical models is that most knowledge is captured in the model structure. Many models, however, produce inference problems with a lot of symmetries not reflected in the graphical structure and hence not exploitable by efficient inference techniques such as belief propagation (BP). In this paper, we present a new and simple BP algorithm, called counting BP, that exploits such additional symmetries. Starting from a given factor graph, counting BP first constructs a compressed factor graph of clusternodes and clusterfactors, corresponding to sets of nodes and factors that are indistinguishable given the evidence. Then it runs a modified BP algorithm on the compressed graph that is equivalent to running BP on the original factor graph. Our experiments show that counting BP is applicable to a variety of important AI tasks such as (dynamic) relational models and boolean model counting, and that significant efficiency gains are obtainable, often by orders of magnitude.
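
The compression step described in this abstract, grouping nodes that are indistinguishable given the evidence, can be illustrated with a small color-passing sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `color_passing`, the input encoding (a dict of initial variable colors and a list of `(potential_id, scope)` factors), and the stopping rule are all hypothetical choices made here, and the modified BP run on the compressed graph is not shown.

```python
from collections import defaultdict


def color_passing(var_colors, factors, max_iters=100):
    """Group variables that look identical under repeated neighborhood refinement.

    var_colors: dict mapping variable name -> initial color (e.g. its evidence state).
    factors: list of (potential_id, scope) pairs, where scope is a tuple of variables.
    Returns a dict mapping variable name -> integer cluster id.
    (Hypothetical interface, chosen for this sketch only.)
    """
    colors = dict(var_colors)
    for _ in range(max_iters):
        # A factor's signature: its potential plus the colors of its arguments, in order.
        factor_sigs = [(pid, tuple(colors[v] for v in scope)) for pid, scope in factors]

        # Each variable collects (factor signature, argument position) from its factors.
        inbox = defaultdict(list)
        for sig, (_, scope) in zip(factor_sigs, factors):
            for pos, v in enumerate(scope):
                inbox[v].append((sig, pos))

        # A variable's signature: its old color plus the sorted multiset of messages.
        var_sigs = {v: (colors[v], tuple(sorted(inbox[v]))) for v in colors}

        # Relabel distinct signatures with compact integer ids.
        ids = {}
        new_colors = {v: ids.setdefault(var_sigs[v], len(ids)) for v in colors}

        # Signatures include the old color, so clusters can only split.
        # An unchanged cluster count therefore means the partition is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors
    return colors


# Toy factor graph: a hub "h" linked to three leaves by the same potential "phi";
# only leaf "x3" carries evidence.
clusters = color_passing(
    {"h": "none", "x1": "none", "x2": "none", "x3": "obs"},
    [("phi", ("h", "x1")), ("phi", ("h", "x2")), ("phi", ("h", "x3"))],
)
```

In this toy graph the two unobserved leaves end up in one cluster, while the hub and the observed leaf each form their own, so a compressed graph would need three clusternodes instead of four variables.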