AI · Sep 21, 2023
A Comprehensive Review on Financial Explainable AI
Wei Jie Yeo, Wihan van der Heever, Rui Mao et al.
The success of artificial intelligence (AI), and deep learning models in particular, has led to their widespread adoption across various industries due to their ability to process huge amounts of data and learn complex patterns. However, due to their lack of explainability, there are significant concerns regarding their use in critical sectors, such as finance and healthcare, where decision-making transparency is of paramount importance. In this paper, we provide a comparative survey of methods that aim to improve the explainability of deep learning models within the context of finance. We categorize the collection of explainable AI methods according to their corresponding characteristics, and we review the concerns and challenges of adopting explainable AI methods, together with future directions we deemed appropriate and important.
IR · Jun 22, 2023
Recent Developments in Recommender Systems: A Survey
Yang Li, Kangbo Liu, Ranjan Satapathy et al.
In this technical survey, we comprehensively summarize the latest advancements in the field of recommender systems. The objective of this study is to provide an overview of the current state-of-the-art in the field and highlight the latest trends in the development of recommender systems. The study starts with a comprehensive summary of the main taxonomy of recommender systems, including personalized and group recommender systems, and then delves into the category of knowledge-based recommender systems. In addition, the survey analyzes the robustness, data bias, and fairness issues in recommender systems, summarizing the evaluation metrics used to assess the performance of these systems. Finally, the study provides insights into the latest trends in the development of recommender systems and highlights the new directions for future research in the field.
CL · Mar 5, 2023
FinXABSA: Explainable Finance through Aspect-Based Sentiment Analysis
Keane Ong, Wihan van der Heever, Ranjan Satapathy et al.
This paper presents a novel approach for explainability in financial analysis by deriving financially-explainable statistical relationships through aspect-based sentiment analysis, Pearson correlation, Granger causality, and the uncertainty coefficient. The proposed methodology involves constructing an aspect list from financial literature and applying aspect-based sentiment analysis on social media text to compute sentiment scores for each aspect. Pearson correlation is then applied to uncover financially explainable relationships between aspect sentiment scores and stock prices. Findings for derived relationships are made robust by applying Granger causality to determine the forecasting ability of each aspect sentiment score for stock prices. Finally, a further layer of interpretability is added by evaluating uncertainty coefficient scores between aspect sentiment scores and stock prices. This allows us to determine the aspects whose sentiment scores are most statistically significant for stock prices. Relative to other methods, our approach provides a more informative and accurate understanding of the relationship between sentiment analysis and stock prices. Specifically, this methodology enables an interpretation of the statistical relationship between aspect-based sentiment scores and stock prices, which offers explainability to AI-driven financial decision-making.
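Two of the statistics named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names and the toy inputs are assumptions, the uncertainty coefficient is taken to be Theil's U computed on discretized labels, and Granger causality is left out because it requires a lagged regression.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def uncertainty_coefficient(x, y):
    """Theil's U(x|y): fraction of the entropy of x that is explained by y."""
    hx = entropy(x)
    if hx == 0:
        return 1.0  # x is constant, so nothing is left to explain
    n = len(x)
    h_x_given_y = 0.0
    for y_value, count in Counter(y).items():
        subset = [a for a, b in zip(x, y) if b == y_value]
        h_x_given_y += (count / n) * entropy(subset)
    return (hx - h_x_given_y) / hx


# Toy data (hypothetical): daily aspect sentiment scores and closing prices.
sentiment = [0.1, 0.4, 0.3, 0.8]
prices = [10.0, 10.6, 10.4, 11.5]
r = pearson(sentiment, prices)

# Toy discretized labels (hypothetical): sentiment sign vs. price direction.
u = uncertainty_coefficient(["up", "down", "up", "down"],
                            ["pos", "neg", "pos", "neg"])
```

Theil's U is asymmetric, so `uncertainty_coefficient(x, y)` and `uncertainty_coefficient(y, x)` generally differ; which direction the paper reports is not stated in the abstract.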
CY · Jul 3, 2024
Explainable Natural Language Processing for Corporate Sustainability Analysis
Keane Ong, Rui Mao, Ranjan Satapathy et al.
Sustainability commonly refers to entities, such as individuals, companies, and institutions, having a non-detrimental (or even positive) impact on the environment, society, and the economy. With sustainability becoming a synonym of acceptable and legitimate behaviour, it is being increasingly demanded and regulated. Several frameworks and standards have been proposed to measure the sustainability impact of corporations, including United Nations' sustainable development goals and the recently introduced global sustainability reporting framework, amongst others. However, the concept of corporate sustainability is complex due to the diverse and intricate nature of firm operations (i.e. geography, size, business activities, interlinks with other stakeholders). As a result, corporate sustainability assessments are plagued by subjectivity both within data that reflect corporate sustainability efforts (i.e. corporate sustainability disclosures) and the analysts evaluating them. This subjectivity can be distilled into distinct challenges, such as incompleteness, ambiguity, unreliability and sophistication on the data dimension, as well as limited resources and potential bias on the analyst dimension. Put together, subjectivity hinders effective cost attribution to entities non-compliant with prevailing sustainability expectations, potentially rendering sustainability efforts and their associated regulations futile. To this end, we argue that Explainable Natural Language Processing (XNLP) can significantly enhance corporate sustainability analysis. Specifically, linguistic understanding algorithms (lexical, semantic, syntactic), integrated with XAI capabilities (interpretability, explainability, faithfulness), can bridge gaps in analyst resources and mitigate subjectivity problems within data.
CL · Feb 19, 2024 · Code
How Interpretable are Reasoning Explanations from Prompting Large Language Models?
Wei Jie Yeo, Ranjan Satapathy, Rick Siow Mong Goh et al.
Prompt Engineering has garnered significant attention for enhancing the performance of large language models across a multitude of tasks. Techniques such as the Chain-of-Thought not only bolster task performance but also delineate a clear trajectory of reasoning steps, offering a tangible form of explanation for the audience. Prior works on interpretability assess the reasoning chains yielded by Chain-of-Thought solely along a singular axis, namely faithfulness. We present a comprehensive and multifaceted evaluation of interpretability, examining not only faithfulness but also robustness and utility across multiple commonsense reasoning benchmarks. Likewise, our investigation is not confined to a single prompting technique; it expansively covers a multitude of prevalent prompting techniques employed in large language models, thereby ensuring a wide-ranging and exhaustive evaluation. In addition, we introduce a simple interpretability alignment technique, termed Self-Entailment-Alignment Chain-of-thought, that yields more than 70% improvements across multiple dimensions of interpretability. Code is available at https://github.com/SenticNet/CoT_interpretability
AI · Mar 23
Beyond Correlation: Refutation-Validated Aspect-Based Sentiment Analysis for Explainable Energy Market Returns
Wihan van der Heever, Keane Ong, Ranjan Satapathy et al.
This paper proposes a refutation-validated framework for aspect-based sentiment analysis in financial markets, addressing the limitations of correlational studies that cannot distinguish genuine associations from spurious ones. Using X data for the energy sector, we test whether aspect-level sentiment signals show robust, refutation-validated relationships with equity returns. Our pipeline combines net-ratio scoring with z-normalization, OLS with Newey-West HAC errors, and refutation tests including placebo, random common cause, subset stability, and bootstrap. Across six energy tickers, only a few associations survive all checks, while renewables show aspect- and horizon-specific responses. While not establishing causality, the framework provides statistically robust, directionally interpretable signals, with limited sample size (six stocks, one quarter) constraining generalizability and framing this work as a methodological proof of concept.
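The scoring and placebo steps in this pipeline can be illustrated with a small sketch. It rests on several assumptions that the abstract does not confirm: the net ratio is taken as (positive - negative) / (positive + negative), z-normalization uses the population standard deviation, and the placebo test shuffles the sentiment series. All function names are hypothetical, and the Newey-West HAC standard errors used in the paper are not reproduced here (only the plain OLS slope is computed).

```python
import random
import statistics


def net_ratio(pos, neg):
    """Assumed net-ratio score: (pos - neg) / (pos + neg), or 0 with no mentions."""
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def zscore(values):
    """Z-normalize a series using the population standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def placebo_slopes(x, y, n_trials=100, seed=0):
    """Placebo refutation: refit the slope after shuffling x.

    If the original slope is not extreme relative to these placebo slopes,
    the association is likely spurious.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_trials):
        shuffled = list(x)
        rng.shuffle(shuffled)
        slopes.append(ols_slope(shuffled, y))
    return slopes


# Toy daily counts of (positive, negative) mentions for one aspect (hypothetical).
counts = [(3, 1), (2, 2), (5, 0), (1, 4), (4, 1)]
signal = zscore([net_ratio(p, n) for p, n in counts])
returns = [0.01, 0.00, 0.02, -0.015, 0.012]  # toy next-day returns (hypothetical)
slope = ols_slope(signal, returns)
placebo = placebo_slopes(signal, returns, n_trials=20)
```

A typical use compares `abs(slope)` against the distribution of `abs(s)` for `s` in `placebo`; the remaining refutation tests named in the abstract (random common cause, subset stability, bootstrap) follow the same refit-and-compare pattern.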
CL · Dec 14, 2024 · Code
SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation
Qilong Wu, Xiaoneng Xiang, Hejia Huang et al.
The rapid growth of the financial sector and the rising focus on Environmental, Social, and Governance (ESG) considerations highlight the need for advanced NLP tools. However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. Leveraging this dataset, we developed SusGen-GPT, a suite of models achieving state-of-the-art performance across six adapted and two off-the-shelf tasks, trailing GPT-4 by only 2% despite using 7-8B parameters compared to GPT-4's 1,700B. Based on this, we propose the SusGen system, integrated with Retrieval-Augmented Generation (RAG), to assist in sustainability report generation. This work demonstrates the efficiency of our approach, advancing research in finance and ESG.
CL · May 29, 2025 · Code
Understanding Refusal in Language Models with Sparse Autoencoders
Wei Jie Yeo, Nirmalendu Prakash, Clement Neo et al.
Refusal is a key safety behavior in aligned language models, yet the internal mechanisms driving refusals remain opaque. In this work, we conduct a mechanistic study of refusal in instruction-tuned LLMs using sparse autoencoders to identify latent features that causally mediate refusal behaviors. We apply our method to two open-source chat models and intervene on refusal-related features to assess their influence on generation, validating their behavioral impact across multiple harmful datasets. This enables a fine-grained inspection of how refusal manifests at the activation level and addresses key research questions such as investigating upstream-downstream latent relationships and understanding the mechanisms of adversarial jailbreaking techniques. We also establish the usefulness of refusal features in enhancing generalization for linear probes to out-of-distribution adversarial samples in classification tasks. We open source our code at https://github.com/wj210/refusal_sae.
CV · May 23, 2025 · Code
Debiasing CLIP: Interpreting and Correcting Bias in Attention Heads
Wei Jie Yeo, Rui Mao, Moloud Abdar et al.
Multimodal models like CLIP have gained significant attention due to their remarkable zero-shot performance across various tasks. However, studies have revealed that CLIP can inadvertently learn spurious associations between target variables and confounding factors. To address this, we introduce Locate-Then-Correct (LTC), a contrastive framework that identifies spurious attention heads in Vision Transformers via mechanistic insights and mitigates them through targeted ablation. Furthermore, LTC identifies salient, task-relevant attention heads, enabling the integration of discriminative features through orthogonal projection to improve classification performance. We evaluate LTC on benchmarks with inherent background and gender biases, achieving a gain of over 50% in worst-group accuracy compared to non-training post-hoc baselines. Additionally, we visualize the representation of selected heads and find that the presented interpretation corroborates our contrastive mechanism for identifying both spurious and salient attention heads. Code available at https://github.com/wj210/CLIP_LTC.
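For readers unfamiliar with orthogonal projection, the basic operation is sketched below: removing from a feature vector its component along a given direction. This is a generic illustration only; how LTC chooses directions and applies the projection is not described in the abstract, and the function name and toy vectors are assumptions.

```python
def project_out(vector, direction):
    """Return `vector` with its component along `direction` removed.

    The result is orthogonal to `direction` (their dot product is zero).
    """
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0:
        return list(vector)  # a zero direction has nothing to remove
    coef = sum(v * d for v, d in zip(vector, direction)) / norm_sq
    return [v - coef * d for v, d in zip(vector, direction)]


# Toy example (hypothetical): strip a "background" direction from an embedding.
embedding = [2.0, 1.0, 0.5]
background_direction = [1.0, 0.0, 0.0]
cleaned = project_out(embedding, background_direction)
```

After the call, `cleaned` keeps the coordinates that are unrelated to `background_direction` and has no remaining component along it.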
CL · Oct 18, 2024 · Code
Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria
Large Language Models (LLMs) are capable of generating persuasive Natural Language Explanations (NLEs) to justify their answers. However, the faithfulness of these explanations should not be readily trusted at face value. Recent studies have proposed various methods to measure the faithfulness of NLEs, typically by inserting perturbations at the explanation or feature level. We argue that these approaches are neither comprehensive nor correctly designed according to the established definition of faithfulness. Moreover, we highlight the risks of grounding faithfulness findings on out-of-distribution samples. In this work, we leverage a causal mediation technique called activation patching to measure the faithfulness of an explanation towards supporting the explained answer. Our proposed metric, Causal Faithfulness, quantifies the consistency of causal attributions between explanations and the corresponding model outputs as the indicator of faithfulness. We experimented across models varying from 2B to 27B parameters and found that models that underwent alignment tuning tend to produce more faithful and plausible explanations. We find that Causal Faithfulness is a promising improvement over existing faithfulness tests by taking into account the model's internal computations and avoiding out-of-distribution concerns that could otherwise undermine the validity of faithfulness assessments. We release the code at https://github.com/wj210/Causal-Faithfulness
CR · Aug 16, 2025 · Code
Mitigating Jailbreaks with Intent-Aware LLMs
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria
Despite extensive safety-tuning, large language models (LLMs) remain vulnerable to jailbreak attacks via adversarially crafted instructions, reflecting a persistent trade-off between safety and task performance. In this work, we propose Intent-FT, a simple and lightweight fine-tuning approach that explicitly trains LLMs to infer the underlying intent of an instruction before responding. By fine-tuning on a targeted set of adversarial instructions, Intent-FT enables LLMs to generalize intent deduction to unseen attacks, thereby substantially improving their robustness. We comprehensively evaluate both parametric and non-parametric attacks across open-source and proprietary models, considering harmfulness from attacks, utility, over-refusal, and impact against white-box threats. Empirically, Intent-FT consistently mitigates all evaluated attack categories, with no single attack exceeding a 50% success rate, whereas existing defenses remain only partially effective. Importantly, our method preserves the model's general capabilities and reduces excessive refusals on benign instructions containing superficially harmful keywords. Furthermore, models trained with Intent-FT accurately identify hidden harmful intent in adversarial attacks, and these learned intentions can be effectively transferred to enhance vanilla model defenses. We publicly release our code at https://github.com/wj210/Intent_Jailbreak.
CL · Feb 13, 2024
Plausible Extractive Rationalization through Semi-Supervised Entailment Signal
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria
The increasing use of complex and opaque black box models requires the adoption of interpretable measures. One such option is extractive rationalizing models, which serve as a more interpretable alternative. These models, also known as Explain-Then-Predict models, employ an explainer model to extract rationales and subsequently condition the predictor with the extracted information. Their primary objective is to provide precise and faithful explanations, represented by the extracted rationales. In this paper, we take a semi-supervised approach to optimize for the plausibility of extracted rationales. We adopt a pre-trained natural language inference (NLI) model and further fine-tune it on a small set of supervised rationales (10%). The NLI predictor is leveraged as a source of supervisory signals to the explainer via entailment alignment. We show that, by enforcing the alignment agreement between the explanation and answer in a question-answering task, the performance can be improved without access to ground truth labels. We evaluate our approach on the ERASER dataset and show that our approach achieves comparable results with supervised extractive models and outperforms unsupervised approaches by more than 100%.
CL · Jan 27, 2025
ESGSenticNet: A Neurosymbolic Knowledge Base for Corporate Sustainability Analysis
Keane Ong, Rui Mao, Deeksha Varshney et al.
Evaluating corporate sustainability performance is essential to drive sustainable business practices, amid the need for a more sustainable economy. However, this is hindered by the complexity and volume of corporate sustainability data (i.e. sustainability disclosures), not least by the effectiveness of the NLP tools used to analyse them. To this end, we identify three primary challenges (immateriality, complexity, and subjectivity) that exacerbate the difficulty of extracting insights from sustainability disclosures. To address these issues, we introduce ESGSenticNet, a publicly available knowledge base for sustainability analysis. ESGSenticNet is constructed from a neurosymbolic framework that integrates specialised concept parsing, GPT-4o inference, and semi-supervised label propagation, together with a hierarchical taxonomy. This approach culminates in a structured knowledge base of 44k knowledge triplets, such as ('halve carbon emission', supports, 'emissions control'), for effective sustainability analysis. Experiments indicate that ESGSenticNet, when deployed as a lexical method, more effectively captures relevant and actionable sustainability information from sustainability disclosures compared to state-of-the-art baselines. Besides capturing a high number of unique ESG topic terms, ESGSenticNet outperforms baselines on the ESG relatedness and ESG action orientation of these terms by 26% and 31% respectively. These metrics describe the extent to which topic terms are related to ESG, and depict an action toward ESG. Moreover, when deployed as a lexical method, ESGSenticNet does not require any training, possessing a key advantage in its simplicity for non-technical stakeholders.
CL · Sep 7, 2025
Beyond I'm Sorry, I Can't: Dissecting Large Language Model Refusal
Nirmalendu Prakash, Yeo Wei Jie, Amir Abdullah et al.
Refusal on harmful prompts is a key safety behaviour in instruction-tuned large language models (LLMs), yet the internal causes of this behaviour remain poorly understood. We study two public instruction-tuned models, Gemma-2-2B-IT and LLaMA-3.1-8B-IT, using sparse autoencoders (SAEs) trained on residual-stream activations. Given a harmful prompt, we search the SAE latent space for feature sets whose ablation flips the model from refusal to compliance, demonstrating causal influence and creating a jailbreak. Our search proceeds in three stages: (1) Refusal Direction: find a refusal-mediating direction and collect SAE features near that direction; (2) Greedy Filtering: prune to a minimal set; and (3) Interaction Discovery: fit a factorization machine (FM) that captures nonlinear interactions among the remaining active features and the minimal set. This pipeline yields a broad set of jailbreak-critical features, offering insight into the mechanistic basis of refusal. Moreover, we find evidence of redundant features that remain dormant unless earlier features are suppressed. Our findings highlight the potential for fine-grained auditing and targeted intervention in safety behaviours by manipulating the interpretable latent space.
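The Greedy Filtering stage (stage 2) can be sketched as a simple pruning loop. This is an illustrative reading of "prune to a minimal set", not the authors' code: the oracle `flips_to_compliance` is a hypothetical stand-in for running the model with the given SAE features ablated, and the feature indices are toy values.

```python
def greedy_minimal_set(features, flips_to_compliance):
    """Drop features one at a time while the ablation still flips the model.

    `flips_to_compliance(subset)` is a hypothetical oracle that returns True
    when ablating `subset` changes the model's response from refusal to
    compliance. The result is minimal in the sense that removing any single
    remaining feature breaks the flip; it is not guaranteed to be the
    smallest possible set.
    """
    kept = list(features)
    for feature in list(features):
        trial = [f for f in kept if f != feature]
        if flips_to_compliance(trial):
            kept = trial  # this feature was not needed for the flip
    return kept


# Toy oracle (hypothetical): the flip needs features 3 and 7 ablated together.
def toy_oracle(subset):
    return {3, 7} <= set(subset)


candidates = [1, 3, 5, 7, 9]
minimal = greedy_minimal_set(candidates, toy_oracle)
```

With the toy oracle above, features 1, 5, and 9 are pruned because the flip survives without them, while 3 and 7 are retained.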
CL · Jun 17, 2024
Self-training Large Language Models through Knowledge Detection
Wei Jie Yeo, Teddy Ferdinan, Przemyslaw Kazienko et al.
Large language models (LLMs) often necessitate extensive labeled datasets and training compute to achieve impressive performance across downstream tasks. This paper explores a self-training paradigm, where the LLM autonomously curates its own labels and selectively trains on unknown data samples identified through a reference-free consistency method. Empirical evaluations demonstrate significant improvements in reducing hallucination in generation across multiple subjects. Furthermore, the selective training framework mitigates catastrophic forgetting in out-of-distribution benchmarks, addressing a critical limitation in training LLMs. Our findings suggest that such an approach can substantially reduce the dependency on large labeled datasets, paving the way for more scalable and cost-effective language model training.
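One way to picture a reference-free consistency check is to sample several answers per question and treat low agreement as a sign that the model does not know the answer. The sketch below follows that reading, which is an assumption: the abstract does not specify the consistency measure, and the function names, the 0.6 threshold, and the toy samples are all hypothetical.

```python
from collections import Counter


def consistency(answers):
    """Share of sampled answers that agree with the most common answer."""
    counts = Counter(a.strip().lower() for a in answers)
    return counts.most_common(1)[0][1] / len(answers)


def select_unknown(samples, threshold=0.6):
    """Return the questions whose sampled answers agree less than `threshold`.

    `samples` maps each question to a list of answers sampled from the model;
    no gold labels are needed, which is what makes the check reference-free.
    """
    return [q for q, answers in samples.items()
            if consistency(answers) < threshold]


# Toy sampled answers (hypothetical model outputs).
sampled = {
    "capital of France?": ["Paris", "paris", "Paris"],
    "year the firm was founded?": ["1987", "1992", "2001"],
}
to_train_on = select_unknown(sampled)
```

Only the questions returned by `select_unknown` would be passed on to the selective training step.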
CL · Jan 14, 2022
Polarity and Subjectivity Detection with Multitask Learning and BERT Embedding
Ranjan Satapathy, Shweta Pardeshi, Erik Cambria
Multitask learning often helps improve the performance of related tasks, as these tasks are often interdependent and perform better when solved in a joint framework. In this paper, we present a deep multitask learning framework that jointly performs polarity and subjectivity detection. We propose an attention-based multitask model for predicting polarity and subjectivity. The input sentences are transformed into vectors using pre-trained BERT and GloVe embeddings, and the results show that the BERT embedding-based model works better than the GloVe-based model. We compare our approach with state-of-the-art models for both subjectivity and polarity classification in single-task and multitask frameworks. The proposed approach reports baseline performances for both polarity detection and subjectivity detection.
HC · May 22, 2019
Can a Humanoid Robot be part of the Organizational Workforce? A User Study Leveraging Sentiment Analysis
Nidhi Mishra, Manoj Ramanathan, Ranjan Satapathy et al.
Hiring robots for the workplace is a challenging task, as robots have to cater to customer demands, follow organizational protocols, and behave with social etiquette. In this study, we propose to have a humanoid social robot, Nadine, as a customer service agent in an open social work environment. The objective of this study is to analyze the effects of humanoid robots on customers in a work environment and to see whether the robot can handle social scenarios. We propose to evaluate these objectives through two modes, namely, survey questionnaire and customer feedback. We also propose a novel approach to analyze customer feedback data (text) using sentic computing methods. Specifically, we employ aspect extraction and sentiment analysis to analyze the data. From our framework, we detect the sentiment associated with the aspects that mainly concerned the customers during their interaction. This allows us to understand customers' expectations and the current limitations of robots as employees.
CL · Apr 24, 2019
PhonSenticNet: A Cognitive Approach to Microtext Normalization for Concept-Level Sentiment Analysis
Ranjan Satapathy, Aalind Singh, Erik Cambria
With the current upsurge in the usage of social media platforms, the trend of using short text (microtext) in place of standard words has seen a significant rise. The usage of microtext poses a considerable performance issue in concept-level sentiment analysis, since models are trained on standard words. This paper discusses the impact of coupling sub-symbolic (phonetics) with symbolic (machine learning) Artificial Intelligence to transform the out-of-vocabulary concepts into their standard in-vocabulary form. The phonetic distance is calculated using the Sørensen similarity algorithm. The phonetically similar in-vocabulary concepts thus obtained are then used to compute the correct polarity value, which was previously being miscalculated because of the presence of microtext. Our proposed framework increases the accuracy of polarity detection by 6% as compared to the earlier model. This also validates the fact that microtext normalization is a necessary prerequisite for the sentiment analysis task.
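The Sørensen (Dice) similarity mentioned in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it compares raw character bigrams, whereas the paper computes the distance on phonetic representations, and the function names and toy vocabulary are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Overlapping character bigrams of a string."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def sorensen_dice(a, b):
    """Sørensen-Dice similarity: 2 * |A ∩ B| / (|A| + |B|) over bigram multisets."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        return 1.0 if a == b else 0.0  # strings too short to form bigrams
    overlap = sum((ca & cb).values())
    return 2 * overlap / total


def normalize(token, vocabulary):
    """Map an out-of-vocabulary token to its most similar in-vocabulary word."""
    return max(vocabulary, key=lambda word: sorensen_dice(token, word))


# Toy vocabulary (hypothetical) and a microtext token.
vocab = ["tomorrow", "today", "tonight"]
standard = normalize("tomoro", vocab)
```

Running the same comparison on phonetic encodings rather than raw characters, as the paper describes, should handle heavily abbreviated forms (for example, vowel-dropped spellings) that share few character bigrams with the standard word.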