Eugene Santos

AI
h-index1
9papers
174citations
Novelty47%
AI Score37

9 Papers

AIMar 8, 2023
Learning the Finer Things: Bayesian Structure Learning at the Instantiation Level

Chase Yakaboski, Eugene Santos

Successful machine learning methods require a trade-off between memorization and generalization. Too much memorization and the model cannot generalize to unobserved examples. Too much over-generalization and we risk under-fitting the data. While we commonly measure their performance through cross validation and accuracy metrics, how should these algorithms cope in domains that are extremely under-determined where accuracy is always unsatisfactory? We present a novel probabilistic graphical model structure learning approach that can learn, generalize and explain in these elusive domains by operating at the random variable instantiation level. Using Minimum Description Length (MDL) analysis, we propose a new decomposition of the learning problem over all training exemplars, fusing together minimal entropy inferences to construct a final knowledge base. By leveraging Bayesian Knowledge Bases (BKBs), a framework that operates at the instantiation level and inherently subsumes Bayesian Networks (BNs), we develop both a theoretical MDL score and associated structure learning algorithm that demonstrates significant improvements over learned BNs on 40 benchmark datasets. Further, our algorithm incorporates recent off-the-shelf DAG learning techniques enabling tractable results even on large problems. We then demonstrate the utility of our approach in a significantly under-determined domain by learning gene regulatory networks on breast cancer gene mutational data available from The Cancer Genome Atlas (TCGA).

AIOct 28, 2023
AI for Open Science: A Multi-Agent Perspective for Ethically Translating Data to Knowledge

Chase Yakaboski, Gregory Hyde, Clement Nyanhongo et al.

AI for Science (AI4Science), particularly in the form of self-driving labs, has the potential to sideline human involvement and hinder scientific discovery within the broader community. While prior research has focused on ensuring the responsible deployment of AI applications, enhancing security, and ensuring interpretability, we also propose that promoting openness in AI4Science discoveries should be carefully considered. In this paper, we introduce the concept of AI for Open Science (AI4OS) as a multi-agent extension of AI4Science with the core principle of maximizing open knowledge translation throughout the scientific enterprise rather than a single organizational unit. We use the established principles of Knowledge Discovery and Data Mining (KDD) to formalize a language around AI4OS. We then discuss three principle stages of knowledge translation embedded in AI4Science systems and detail specific points where openness can be applied to yield an AI4OS alternative. Lastly, we formulate a theoretical metric to assess AI4OS with a supporting ethical argument highlighting its importance. Our goal is that by drawing attention to AI4OS we can ensure the natural consequence of AI4Science (e.g., self-driving labs) is a benefit not only for its developers but for society as a whole.

AISep 27, 2025
Dynamic Trust Calibration Using Contextual Bandits

Bruno M. Henrique, Eugene Santos

Trust calibration between humans and Artificial Intelligence (AI) is crucial for optimal decision-making in collaborative settings. Excessive trust can lead users to accept AI-generated outputs without question, overlooking critical flaws, while insufficient trust may result in disregarding valuable insights from AI systems, hindering performance. Despite its importance, there is currently no definitive and objective method for measuring trust calibration between humans and AI. Current approaches lack standardization and consistent metrics that can be broadly applied across various contexts, and they don't distinguish between the formation of opinions and subsequent human decisions. In this work, we propose a novel and objective method for dynamic trust calibration, introducing a standardized trust calibration measure and an indicator. By utilizing Contextual Bandits-an adaptive algorithm that incorporates context into decision-making-our indicator dynamically assesses when to trust AI contributions based on learned contextual information. We evaluate this indicator across three diverse datasets, demonstrating that effective trust calibration results in significant improvements in decision-making performance, as evidenced by 10 to 38% increase in reward metrics. These findings not only enhance theoretical understanding but also provide practical guidance for developing more trustworthy AI systems supporting decisions in critical domains, for example, disease diagnoses and criminal justice.

LGApr 15, 2025
Reward Distance Comparisons Under Transition Sparsity

Clement Nyanhongo, Bruno Miranda Henrique, Eugene Santos

Reward comparisons are vital for evaluating differences in agent behaviors induced by a set of reward functions. Most conventional techniques utilize the input reward functions to learn optimized policies, which are then used to compare agent behaviors. However, learning these policies can be computationally expensive and can also raise safety concerns. Direct reward comparison techniques obviate policy learning but suffer from transition sparsity, where only a small subset of transitions are sampled due to data collection challenges and feasibility constraints. Existing state-of-the-art direct reward comparison methods are ill-suited for these sparse conditions since they require high transition coverage, where the majority of transitions from a given coverage distribution are sampled. When this requirement is not satisfied, a distribution mismatch between sampled and expected transitions can occur, leading to significant errors. This paper introduces the Sparsity Resilient Reward Distance (SRRD) pseudometric, designed to eliminate the need for high transition coverage by accommodating diverse sample distributions, which are common under transition sparsity. We provide theoretical justification for SRRD's robustness and conduct experiments to demonstrate its practical efficacy across multiple domains.

LGJan 20, 2024
Detecting Hidden Triggers: Mapping Non-Markov Reward Functions to Markov

Gregory Hyde, Eugene Santos

Many Reinforcement Learning algorithms assume a Markov reward function to guarantee optimality. However, not all reward functions are Markov. This paper proposes a framework for mapping non-Markov reward functions into equivalent Markov ones by learning specialized reward automata, Reward Machines. Unlike the general practice of learning Reward Machines, we do not require a set of high-level propositional symbols from which to learn. Rather, we learn hidden triggers, directly from data, that construct them. We demonstrate the importance of learning Reward Machines over their Deterministic Finite-State Automata counterparts given their ability to model reward dependencies. We formalize this distinction in our learning objective. Our mapping process is constructed as an Integer Linear Programming problem. We prove that our mappings form a suitable proxy for maximizing reward expectations. We empirically validate our approach by learning black-box, non-Markov reward functions in the Officeworld domain. Additionally, we demonstrate the effectiveness of learning reward dependencies in a new domain, Breakfastworld.

AIMar 20, 2013
On the Generation of Alternative Explanations with Implications for Belief Revision

Eugene Santos

In general, the best explanation for a given observation makes no promises on how good it is with respect to other alternative explanations. A major deficiency of message-passing schemes for belief revision in Bayesian networks is their inability to generate alternatives beyond the second best. In this paper, we present a general approach based on linear constraint systems that naturally generates alternative explanations in an orderly and highly efficient manner. This approach is then applied to cost-based abduction problems as well as belief revision in Bayesian net works.

AIFeb 27, 2013
Belief Updating by Enumerating High-Probability Independence-Based Assignments

Eugene Santos, Solomon Eyal Shimony

Independence-based (IB) assignments to Bayesian belief networks were originally proposed as abductive explanations. IB assignments assign fewer variables in abductive explanations than do schemes assigning values to all evidentially supported variables. We use IB assignments to approximate marginal probabilities in Bayesian belief networks. Recent work in belief updating for Bayes networks attempts to approximate posterior probabilities by finding a small number of the highest probability complete (or perhaps evidentially supported) assignments. Under certain assumptions, the probability mass in the union of these assignments is sufficient to obtain a good approximation. Such methods are especially useful for highly-connected networks, where the maximum clique size or the cutset size make the standard algorithms intractable. Since IB assignments contain fewer assigned variables, the probability mass in each assignment is greater than in the respective complete assignment. Thus, fewer IB assignments are sufficient, and a good approximation can be obtained more efficiently. IB assignments can be used for efficiently approximating posterior node probabilities even in cases which do not obey the rather strict skewness assumptions used in previous research. Two algorithms for finding the high probability IB assignments are suggested: one by doing a best-first heuristic search, and another by special-purpose integer linear programming. Experimental results show that this approach is feasible for highly connected belief networks.

AIFeb 13, 2013
Sample-and-Accumulate Algorithms for Belief Updating in Bayes Networks

Eugene Santos, Solomon Eyal Shimony, Edward Williams

Belief updating in Bayes nets, a well known computationally hard problem, has recently been approximated by several deterministic algorithms, and by various randomized approximation algorithms. Deterministic algorithms usually provide probability bounds, but have an exponential runtime. Some randomized schemes have a polynomial runtime, but provide only probability estimates. We present randomized algorithms that enumerate high-probability partial instantiations, resulting in probability bounds. Some of these algorithms are also sampling algorithms. Specifically, we introduce and evaluate a variant of backward sampling, both as a sampling algorithm and as a randomized enumeration algorithm. We also relax the implicit assumption used by both sampling and accumulation algorithms, that query nodes must be instantiated in all the samples.

AIFeb 6, 2013
Cost-Sharing in Bayesian Knowledge Bases

Solomon Eyal Shimony, Carmel Domshlak, Eugene Santos

Bayesian knowledge bases (BKBs) are a generalization of Bayes networks and weighted proof graphs (WAODAGs), that allow cycles in the causal graph. Reasoning in BKBs requires finding the most probable inferences consistent with the evidence. The cost-sharing heuristic for finding least-cost explanations in WAODAGs was presented and shown to be effective by Charniak and Husain. However, the cycles in BKBs would make the definition of cost-sharing cyclic as well, if applied directly to BKBs. By treating the defining equations of cost-sharing as a system of equations, one can properly define an admissible cost-sharing heuristic for BKBs. Empirical evaluation shows that cost-sharing improves performance significantly when applied to BKBs.