Huaduo Wang

LG
h-index10
13papers
108citations
Novelty47%
AI Score38

13 Papers

CLJul 26, 2024
A Reliable Common-Sense Reasoning Socialbot Built Using LLMs and Goal-Directed ASP

Yankai Zeng, Abhiramon Rajashekharan, Kinjal Basu et al.

The development of large language models (LLMs), such as GPT, has enabled the construction of several socialbots, like ChatGPT, that are receiving a lot of attention for their ability to simulate a human conversation. However, the conversation is not guided by a goal and is hard to control. In addition, because LLMs rely more on pattern recognition than deductive reasoning, they can give confusing answers and have difficulty integrating multiple topics into a cohesive response. These limitations often lead the LLM to deviate from the main topic to keep the conversation interesting. We propose AutoCompanion, a socialbot that uses an LLM model to translate natural language into predicates (and vice versa) and employs commonsense reasoning based on Answer Set Programming (ASP) to hold a social conversation with a human. In particular, we rely on s(CASP), a goal-directed implementation of ASP as the backend. This paper presents the framework design and how an LLM is used to parse user messages and generate a response from the s(CASP) engine output. To validate our proposal, we describe (real) conversations in which the chatbot's goal is to keep the user entertained by talking about movies and books, and s(CASP) ensures (i) correctness of answers, (ii) coherence (and precision) during the conversation, which it dynamically regulates to achieve its specific purpose, and (iii) no deviation from the main topic.

LGAug 16, 2022
FOLD-SE: An Efficient Rule-based Machine Learning Algorithm with Scalable Explainability

Huaduo Wang, Gopal Gupta

We present FOLD-SE, an efficient, explainable machine learning algorithm for classification tasks given tabular data containing numerical and categorical values. FOLD-SE generates a set of default rules-essentially a stratified normal logic program-as an (explainable) trained model. Explainability provided by FOLD-SE is scalable, meaning that regardless of the size of the dataset, the number of learned rules and learned literals stay quite small while good accuracy in classification is maintained. A model with smaller number of rules and literals is easier to understand for human beings. FOLD-SE is competitive with state-of-the-art machine learning algorithms such as XGBoost and Multi-Layer Perceptrons (MLP) wrt accuracy of prediction. However, unlike XGBoost and MLP, the FOLD-SE algorithm is explainable. The FOLD-SE algorithm builds upon our earlier work on developing the explainable FOLD-R++ machine learning algorithm for binary classification and inherits all of its positive features. Thus, pre-processing of the dataset, using techniques such as one-hot encoding, is not needed. Like FOLD-R++, FOLD-SE uses prefix sum to speed up computations resulting in FOLD-SE being an order of magnitude faster than XGBoost and MLP in execution speed. The FOLD-SE algorithm outperforms FOLD-R++ as well as other rule-learning algorithms such as RIPPER in efficiency, performance and scalability, especially for large datasets. A major reason for scalable explainability of FOLD-SE is the use of a literal selection heuristics based on Gini Impurity, as opposed to Information Gain used in FOLD-R++. A multi-category classification version of FOLD-SE is also presented.

LGJan 30, 2023
NeSyFOLD: Neurosymbolic Framework for Interpretable Image Classification

Parth Padalkar, Huaduo Wang, Gopal Gupta

Deep learning models such as CNNs have surpassed human performance in computer vision tasks such as image classification. However, despite their sophistication, these models lack interpretability which can lead to biased outcomes reflecting existing prejudices in the data. We aim to make predictions made by a CNN interpretable. Hence, we present a novel framework called NeSyFOLD to create a neurosymbolic (NeSy) model for image classification tasks. The model is a CNN with all layers following the last convolutional layer replaced by a stratified answer set program (ASP). A rule-based machine learning algorithm called FOLD-SE-M is used to derive the stratified answer set program from binarized filter activations of the last convolutional layer. The answer set program can be viewed as a rule-set, wherein the truth value of each predicate depends on the activation of the corresponding kernel in the CNN. The rule-set serves as a global explanation for the model and is interpretable. A justification for the predictions made by the NeSy model can be obtained using an ASP interpreter. We also use our NeSyFOLD framework with a CNN that is trained using a sparse kernel learning technique called Elite BackProp (EBP). This leads to a significant reduction in rule-set size without compromising accuracy or fidelity thus improving scalability of the NeSy model and interpretability of its rule-set. Evaluation is done on datasets with varied complexity and sizes. To make the rule-set more intuitive to understand, we propose a novel algorithm for labelling each kernel's corresponding predicate in the rule-set with the semantic concept(s) it learns. We evaluate the performance of our "semantic labelling algorithm" to quantify the efficacy of the semantic labelling for both the NeSy model and the NeSy-EBP model.

LGJun 15, 2022
FOLD-TR: A Scalable and Efficient Inductive Learning Algorithm for Learning To Rank

Huaduo Wang, Gopal Gupta

FOLD-R++ is a new inductive learning algorithm for binary classification tasks. It generates an (explainable) normal logic program for mixed type (numerical and categorical) data. We present a customized FOLD-R++ algorithm with the ranking framework, called FOLD-TR, that aims to rank new items following the ranking pattern in the training data. Like FOLD-R++, the FOLD-TR algorithm is able to handle mixed-type data directly and provide native justification to explain the comparison between a pair of items.

LGSep 14, 2025
Comparative Analysis of FOLD-SE vs. FOLD-R++ in Binary Classification and XGBoost in Multi-Category Classification

Akshay Murthy, Shawn Sebastian, Manil Shangle et al.

Recently, the demand for Machine Learning (ML) models that can balance accuracy, efficiency, and interpreability has grown significantly. Traditionally, there has been a tradeoff between accuracy and explainability in predictive models, with models such as Neural Networks achieving high accuracy on complex datasets while sacrificing internal transparency. As such, new rule-based algorithms such as FOLD-SE have been developed that provide tangible justification for predictions in the form of interpretable rule sets. The primary objective of this study was to compare FOLD-SE and FOLD-R++, both rule-based classifiers, in binary classification and evaluate how FOLD-SE performs against XGBoost, a widely used ensemble classifier, when applied to multi-category classification. We hypothesized that because FOLD-SE can generate a condensed rule set in a more explainable manner, it would lose upwards of an average of 3 percent in accuracy and F1 score when compared with XGBoost and FOLD-R++ in multiclass and binary classification, respectively. The research used data collections for classification, with accuracy, F1 scores, and processing time as the primary performance measures. Outcomes show that FOLD-SE is superior to FOLD-R++ in terms of binary classification by offering fewer rules but losing a minor percentage of accuracy and efficiency in processing time; in tasks that involve multi-category classifications, FOLD-SE is more precise and far more efficient compared to XGBoost, in addition to generating a comprehensible rule set. The results point out that FOLD-SE is a better choice for both binary tasks and classifications with multiple categories. Therefore, these results demonstrate that rule-based approaches like FOLD-SE can bridge the gap between explainability and performance, highlighting their potential as viable alternatives to black-box models in diverse classification tasks.

AIJun 15, 2025
Building Trustworthy AI by Addressing its 16+2 Desiderata with Goal-Directed Commonsense Reasoning

Alexis R. Tudor, Yankai Zeng, Huaduo Wang et al.

Current advances in AI and its applicability have highlighted the need to ensure its trustworthiness for legal, ethical, and even commercial reasons. Sub-symbolic machine learning algorithms, such as the LLMs, simulate reasoning but hallucinate and their decisions cannot be explained or audited (crucial aspects for trustworthiness). On the other hand, rule-based reasoners, such as Cyc, are able to provide the chain of reasoning steps but are complex and use a large number of reasoners. We propose a middle ground using s(CASP), a goal-directed constraint-based answer set programming reasoner that employs a small number of mechanisms to emulate reliable and explainable human-style commonsense reasoning. In this paper, we explain how s(CASP) supports the 16 desiderata for trustworthy AI introduced by Doug Lenat and Gary Marcus (2023), and two additional ones: inconsistency detection and the assumption of alternative worlds. To illustrate the feasibility and synergies of s(CASP), we present a range of diverse applications, including a conversational chatbot and a virtually embodied reasoner.

LGFeb 14, 2022
FOLD-RM: A Scalable, Efficient, and Explainable Inductive Learning Algorithm for Multi-Category Classification of Mixed Data

Huaduo Wang, Farhad Shakerin, Gopal Gupta

FOLD-RM is an automated inductive learning algorithm for learning default rules for mixed (numerical and categorical) data. It generates an (explainable) answer set programming (ASP) rule set for multi-category classification tasks while maintaining efficiency and scalability. The FOLD-RM algorithm is competitive in performance with the widely-used, state-of-the-art algorithms such as XGBoost and multi-layer perceptrons (MLPs), however, unlike these algorithms, the FOLD-RM algorithm produces an explainable model. FOLD-RM outperforms XGBoost on some datasets, particularly large ones. FOLD-RM also provides human-friendly explanations for predictions.

AIOct 17, 2021
AUTO-DISCERN: Autonomous Driving Using Common Sense Reasoning

Suraj Kothawade, Vinaya Khandelwal, Kinjal Basu et al.

Driving an automobile involves the tasks of observing surroundings, then making a driving decision based on these observations (steer, brake, coast, etc.). In autonomous driving, all these tasks have to be automated. Autonomous driving technology thus far has relied primarily on machine learning techniques. We argue that appropriate technology should be used for the appropriate task. That is, while machine learning technology is good for observing and automatically understanding the surroundings of an automobile, driving decisions are better automated via commonsense reasoning rather than machine learning. In this paper, we discuss (i) how commonsense reasoning can be automated using answer set programming (ASP) and the goal-directed s(CASP) ASP system, and (ii) develop the AUTO-DISCERN system using this technology for automating decision-making in driving. The goal of our research, described in this paper, is to develop an autonomous driving system that works by simulating the mind of a human driver. Since driving decisions are based on human-style reasoning, they are explainable, their ethics can be ensured, and they will always be correct, provided the system modeling and system inputs are correct.

LGOct 15, 2021
FOLD-R++: A Scalable Toolset for Automated Inductive Learning of Default Theories from Mixed Data

Huaduo Wang, Gopal Gupta

FOLD-R is an automated inductive learning algorithm for learning default rules for mixed (numerical and categorical) data. It generates an (explainable) answer set programming (ASP) rule set for classification tasks. We present an improved FOLD-R algorithm, called FOLD-R++, that significantly increases the efficiency and scalability of FOLD-R by orders of magnitude. FOLD-R++ improves upon FOLD-R without compromising or losing information in the input training data during the encoding or feature selection phase. The FOLD-R++ algorithm is competitive in performance with the widely-used XGBoost algorithm, however, unlike XGBoost, the FOLD-R++ algorithm produces an explainable model. FOLD-R++ is also competitive in performance with the RIPPER system, however, on large datasets FOLD-R++ outperforms RIPPER. We also create a powerful tool-set by combining FOLD-R++ with s(CASP)-a goal-directed ASP execution engine-to make predictions on new data samples using the answer set program generated by FOLD-R++. The s(CASP) system also produces a justification for the prediction. Experiments presented in this paper show that our improved FOLD-R++ algorithm is a significant improvement over the original design and that the s(CASP) system can make predictions in an efficient manner as well.

AIOct 11, 2021
CASPR: A Commonsense Reasoning-based Conversational Socialbot

Kinjal Basu, Huaduo Wang, Nancy Dominguez et al.

We report on the design and development of the CASPR system, a socialbot designed to compete in the Amazon Alexa Socialbot Challenge 4. CASPR's distinguishing characteristic is that it will use automated commonsense reasoning to truly "understand" dialogs, allowing it to converse like a human. Three main requirements of a socialbot are that it should be able to "understand" users' utterances, possess a strategy for holding a conversation, and be able to learn new knowledge. We developed techniques such as conversational knowledge template (CKT) to approximate commonsense reasoning needed to hold a conversation on specific topics. We present the philosophy behind CASPR's design as well as details of its implementation. We also report on CASPR's performance as well as discuss lessons learned.

AISep 26, 2021
A Clustering and Demotion Based Algorithm for Inductive Learning of Default Theories

Huaduo Wang, Farhad Shakerin, Gopal Gupta

We present a clustering- and demotion-based algorithm called Kmeans-FOLD to induce nonmonotonic logic programs from positive and negative examples. Our algorithm improves upon-and is inspired by-the FOLD algorithm. The FOLD algorithm itself is an improvement over the FOIL algorithm. Our algorithm generates a more concise logic program compared to the FOLD algorithm. Our algorithm uses the K-means based clustering method to cluster the input positive samples before applying the FOLD algorithm. Positive examples that are covered by the partially learned program in intermediate steps are not discarded as in the FOLD algorithm, rather they are demoted, i.e., their weights are reduced in subsequent iterations of the algorithm. Our experiments on the UCI dataset show that a combination of K-Means clustering and our demotion strategy produces significant improvement for datasets with more than one cluster of positive examples. The resulting induced program is also more concise and therefore easier to understand compared to the FOLD and ALEPH systems, two state of the art inductive logic programming (ILP) systems.

LOSep 17, 2021
DiscASP: A Graph-based ASP System for Finding Relevant Consistent Concepts with Applications to Conversational Socialbots

Fang Li, Huaduo Wang, Kinjal Basu et al.

We consider the problem of finding relevant consistent concepts in a conversational AI system, particularly, for realizing a conversational socialbot. Commonsense knowledge about various topics can be represented as an answer set program. However, to advance the conversation, we need to solve the problem of finding relevant consistent concepts, i.e., find consistent knowledge in the "neighborhood" of the current topic being discussed that can be used to advance the conversation. Traditional ASP solvers will generate the whole answer set which is stripped of all the associations between the various atoms (concepts) and thus cannot be used to find relevant consistent concepts. Similarly, goal-directed implementations of ASP will only find concepts directly relevant to a query. We present the DiscASP system that will find the partial consistent model that is relevant to a given topic in a manner similar to how a human will find it. DiscASP is based on a novel graph-based algorithm for finding stable models of an answer set program. We present the DiscASP algorithm, its implementation, and its application to developing a conversational socialbot.

AIApr 2, 2021
grASP: A Graph Based ASP-Solver and Justification System

Fang Li, Huaduo Wang, Gopal Gupta

Answer set programming (ASP) is a popular nonmonotonic-logic based paradigm for knowledge representation and solving combinatorial problems. Computing the answer set of an ASP program is NP-hard in general, and researchers have been investing significant effort to speed it up. The majority of current ASP solvers employ SAT solver-like technology to find these answer sets. As a result, justification for why a literal is in the answer set is hard to produce. There are dependency graph based approaches to find answer sets, but due to the representational limitations of dependency graphs, such approaches are limited. We propose a novel dependency graph-based approach for finding answer sets in which conjunction of goals is explicitly represented as a node which allows arbitrary answer set programs to be uniformly represented. Our representation preserves causal relationships allowing for justification for each literal in the answer set to be elegantly found. Performance results from an implementation are also reported. Our work paves the way for computing answer sets without grounding a program.