Catarina Moreira

AI
h-index: 7
43 papers
1,029 citations
Novelty: 37%
AI Score: 45

43 Papers

IV | Feb 26, 2023 | Code
MDF-Net for abnormality detection by fusing X-rays with clinical data

Chihcheng Hsieh, Isabel Blanco Nobre, Sandra Costa Sousa et al.

This study investigates the effects of including patients' clinical information on the performance of deep learning (DL) classifiers for disease location in chest X-ray images. Although current classifiers achieve high performance using chest X-ray images alone, our interviews with radiologists indicate that clinical data is highly informative and essential for interpreting images and making proper diagnoses. In this work, we propose a novel architecture consisting of two fusion methods that enable the model to simultaneously process patients' clinical data (structured data) and chest X-rays (image data). Since these data modalities are in different dimensional spaces, we propose a spatial arrangement strategy, spatialization, to facilitate the multimodal learning process in a Mask R-CNN model. We performed an extensive experimental evaluation using MIMIC-Eye, a dataset comprising the following modalities: MIMIC-CXR (chest X-ray images), MIMIC IV-ED (patients' clinical data), and REFLACX (annotations of disease locations in chest X-rays). Results show that incorporating patients' clinical data in a DL model together with the proposed fusion methods improves the disease localization in chest X-rays by 12% in terms of Average Precision compared to a standard Mask R-CNN using only chest X-rays. Further ablation studies also emphasize the importance of multimodal DL architectures and the incorporation of patients' clinical data in disease localization. The architecture proposed in this work is publicly available to promote the scientific reproducibility of our study (https://github.com/ChihchengHsieh/multimodal-abnormalities-detection).

LG | Jul 16, 2024
Investigating Imperceptibility of Adversarial Attacks on Tabular Data: An Empirical Analysis

Zhipeng He, Chun Ouyang, Laith Alzubaidi et al.

Adversarial attacks are a potential threat to machine learning models by causing incorrect predictions through imperceptible perturbations to the input data. While these attacks have been extensively studied in unstructured data like images, applying them to tabular data poses new challenges. These challenges arise from the inherent heterogeneity and complex feature interdependencies in tabular data, which set it apart from image data. To account for this distinction, it is necessary to establish tailored imperceptibility criteria specific to tabular data. However, there is currently a lack of standardised metrics for assessing the imperceptibility of adversarial attacks on tabular data. To address this gap, we propose a set of key properties and corresponding metrics designed to comprehensively characterise imperceptible adversarial attacks on tabular data. These are: proximity to the original input, sparsity of altered features, deviation from the original data distribution, sensitivity in perturbing features with narrow distribution, immutability of certain features that should remain unchanged, feasibility of specific feature values that should not go beyond valid practical ranges, and feature interdependencies capturing complex relationships between data attributes. We evaluate the imperceptibility of five adversarial attacks, including both bounded attacks and unbounded attacks, on tabular data using the proposed imperceptibility metrics. The results reveal a trade-off between the imperceptibility and effectiveness of these attacks. The study also identifies limitations in current attack algorithms, offering insights that can guide future research in the area. The findings gained from this empirical analysis provide valuable direction for enhancing the design of adversarial attack algorithms, thereby advancing adversarial machine learning on tabular data.
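
As a rough illustration of the first two properties named in this abstract, the sketch below computes proximity as a Euclidean distance and sparsity as a count of altered features for a purely numerical row. The function names, the tolerance `tol`, and the choice of the L2 norm are assumptions made for illustration; the abstract does not state the paper's exact metric definitions.

```python
import math

def proximity(original, adversarial):
    """Euclidean (L2) distance between the original and perturbed rows.
    Assumption: all features are numerical and already on comparable scales."""
    return math.sqrt(sum((a - o) ** 2 for o, a in zip(original, adversarial)))

def sparsity(original, adversarial, tol=1e-9):
    """Number of features whose value changed by more than tol."""
    return sum(1 for o, a in zip(original, adversarial) if abs(a - o) > tol)

# Hypothetical row and its perturbed version: two of four features change.
x = [0.5, 1.0, 3.0, 0.0]
x_adv = [0.5, 1.0, 0.0, 4.0]

print(proximity(x, x_adv))  # 5.0
print(sparsity(x, x_adv))   # 2
```

A lower value on both measures would indicate a perturbation that stays closer to the original row and touches fewer features.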

HC | Feb 6, 2023
Development of an Immersive Virtual Colonoscopy Viewer for Colon Growths Diagnosis

João Serras, Anderson Maciel, Soraia Paulo et al.

Desktop-based virtual colonoscopy has been proven to be an asset in the identification of colon anomalies. The process is accurate, although time-consuming. The use of immersive interfaces for virtual colonoscopy is incipient and not yet understood. In this work, we present a new design exploring elements of the VR paradigm to make the immersive analysis more efficient while still effective. We also plan to conduct experiments with experts to assess the multi-factor influences of coverage, duration, and diagnostic accuracy.

LG | Mar 4, 2022
Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box

Catarina Moreira, Yu-Liang Chou, Chihcheng Hsieh et al.

This study investigates the impact of machine learning models on the generation of counterfactual explanations by conducting a benchmark evaluation over three different types of models: a decision tree (fully transparent, interpretable, white-box model), a random forest (semi-interpretable, grey-box model), and a neural network (fully opaque, black-box model). We tested the counterfactual generation process using four algorithms from the literature (DiCE, WatcherCF, prototype, and GrowingSpheresCF) on 25 different datasets. Our findings indicate that: (1) Different machine learning models have little impact on the generation of counterfactual explanations; (2) Counterfactual algorithms based uniquely on proximity loss functions are not actionable and will not provide meaningful explanations; (3) One cannot have meaningful evaluation results without guaranteeing plausibility in the counterfactual generation. Algorithms that do not consider plausibility in their internal mechanisms will lead to biased and unreliable conclusions if evaluated with the current state-of-the-art metrics; (4) A counterfactual inspection analysis is strongly recommended to ensure a robust examination of counterfactual explanations and the potential identification of biases.

HC | Mar 3, 2022
Improving X-ray Diagnostics through Eye-Tracking and XR

Catarina Moreira, Isabel Blanco Nobre, Sandra Costa Sousa et al.

There is a growing need to assist radiologists in performing X-ray readings and diagnoses fast, comfortably, and effectively. As radiologists strive to maximize productivity, it is essential to consider the impact of reading rooms in interpreting complex examinations and ensure that higher volume and reporting speeds do not compromise patient outcomes. Virtual Reality (VR) is a disruptive technology for clinical practice in assessing X-ray images. We argue that conjugating eye-tracking with VR devices and Machine Learning may overcome obstacles posed by inadequate ergonomic postures and poor room conditions that often cause erroneous diagnostics when professionals examine digital images.

CY | Sep 30, 2024
Developing Guidelines for Functionally-Grounded Evaluation of Explainable Artificial Intelligence using Tabular Data

Mythreyi Velmurugan, Chun Ouyang, Yue Xu et al.

Explainable Artificial Intelligence (XAI) techniques are used to provide transparency to complex, opaque predictive models. However, these techniques are often designed for image and text data, and it is unclear how fit-for-purpose they are when applied to tabular data. As XAI techniques are rarely evaluated in settings with tabular data, the applicability of existing evaluation criteria and methods is also unclear and needs (re-)examination. For example, some works suggest that evaluation methods may unduly influence the evaluation results when using tabular data. This lack of clarity on evaluation procedures can lead to reduced transparency and ineffective use of XAI techniques in real-world settings. In this study, we examine literature on XAI evaluation to derive guidelines on functionally-grounded assessment of local, post hoc XAI techniques. We identify 20 evaluation criteria and associated evaluation methods, and derive guidelines on when and how each criterion should be evaluated. We also identify key research gaps to be addressed by future work. Our study contributes to the body of knowledge on XAI evaluation through in-depth examination of functionally-grounded XAI evaluation protocols, and has laid the groundwork for future research on XAI evaluation.

CV | Feb 6, 2023
Integrating Eye-Gaze Data into CXR DL Approaches: A Preliminary study

André Luís, Chihcheng Hsieh, Isabel Blanco Nobre et al.

This paper proposes a novel multimodal DL architecture incorporating medical images and eye-tracking data for abnormality detection in chest X-rays. Our results show that applying eye gaze data directly into DL architectures does not show superior predictive performance in abnormality detection in chest X-rays. These results support other works in the literature and suggest that human-generated data, such as eye gaze, needs a more thorough investigation before being applied to DL architectures.

AI | Jul 11, 2024
DALL-M: Context-Aware Clinical Data Augmentation with LLMs

Chihcheng Hsieh, Catarina Moreira, Isabel Blanco Nobre et al.

X-ray images are vital in medical diagnostics, but their effectiveness is limited without clinical context. Radiologists often find chest X-rays insufficient for diagnosing underlying diseases, necessitating the integration of structured clinical features with radiology reports. To address this, we introduce DALL-M, a novel framework that enhances clinical datasets by generating contextual synthetic data. DALL-M augments structured patient data, including vital signs (e.g., heart rate, oxygen saturation), radiology findings (e.g., lesion presence), and demographic factors. It integrates this tabular data with contextual knowledge extracted from radiology reports and domain-specific resources (e.g., Radiopaedia, Wikipedia), ensuring clinical consistency and reliability. DALL-M follows a three-phase process: (i) clinical context storage, (ii) expert query generation, and (iii) context-aware feature augmentation. Using large language models (LLMs), it generates both contextual synthetic values for existing clinical features and entirely new, clinically relevant features. Applied to 799 cases from the MIMIC-IV dataset, DALL-M expanded the original 9 clinical features to 91. Empirical validation with machine learning models (including Decision Trees, Random Forests, XGBoost, and TabNET) demonstrated a 16.5% improvement in F1 score and a 25% increase in Precision and Recall. DALL-M bridges an important gap in clinical data augmentation by preserving data integrity while enhancing predictive modeling in healthcare. Our results show that integrating LLM-generated synthetic features significantly improves model performance, making DALL-M a scalable and practical approach for AI-driven medical diagnostics.

AI | Oct 21, 2024 | Code
PROMPTHEUS: A Human-Centered Pipeline to Streamline SLRs with LLMs

João Pedro Fernandes Torres, Catherine Mulligan, Joaquim Jorge et al.

The growing volume of academic publications poses significant challenges for researchers conducting timely and accurate Systematic Literature Reviews, particularly in fast-evolving fields like artificial intelligence. This growth of academic literature also makes it increasingly difficult for lay people to access scientific knowledge effectively, meaning academic literature is often misrepresented in the popular press and, more broadly, in society. Traditional SLR methods are labor-intensive and error-prone, and they struggle to keep up with the rapid pace of new research. To address these issues, we developed PROMPTHEUS: an AI-driven pipeline solution that automates the SLR process using Large Language Models. We aimed to enhance efficiency by reducing the manual workload while maintaining the precision and coherence required for comprehensive literature synthesis. PROMPTHEUS automates key stages of the SLR process, including systematic search, data extraction, topic modeling using BERTopic, and summarization with transformer models. Evaluations conducted across five research domains demonstrate that PROMPTHEUS reduces review time, achieves high precision, and provides coherent topic organization, offering a scalable and effective solution for conducting literature reviews in an increasingly crowded research landscape. In addition, such tools may reduce the increasing mistrust in science by making summarization more accessible to laypeople. The code for this project can be found on the GitHub repository at https://github.com/joaopftorres/PROMPTHEUS.git

AI | Dec 8, 2020 | Code
Evaluating Explainable Methods for Predictive Process Analytics: A Functionally-Grounded Approach

Mythreyi Velmurugan, Chun Ouyang, Catarina Moreira et al.

Predictive process analytics focuses on predicting the future states of running instances of a business process. While advanced machine learning techniques have been used to increase accuracy of predictions, the resulting predictive models lack transparency. Current explainable machine learning methods, such as LIME and SHAP, can be used to interpret black box models. However, it is unclear how fit for purpose these methods are in explaining process predictive models. In this paper, we draw on evaluation measures used in the field of explainable AI and propose functionally-grounded evaluation metrics for assessing explainable methods in predictive process analytics. We apply the proposed metrics to evaluate the performance of LIME and SHAP in interpreting process predictive models built on XGBoost, which has been shown to be relatively accurate in process predictions. We conduct the evaluation using three open source, real-world event logs and analyse the evaluation results to derive insights. The research contributes to understanding the trustworthiness of explainable methods for predictive process analytics as a fundamental and key step towards human user-oriented evaluation.

AI | Jul 21, 2020 | Code
An Interpretable Probabilistic Approach for Demystifying Black-box Predictive Models

Catarina Moreira, Yu-Liang Chou, Mythreyi Velmurugan et al.

The use of sophisticated machine learning models for critical decision making is faced with a challenge that these models are often applied as a "black-box". This has led to an increased interest in interpretable machine learning, where post hoc interpretation presents a useful mechanism for generating interpretations of complex learning models. In this paper, we propose a novel approach underpinned by an extended framework of Bayesian networks for generating post hoc interpretations of a black-box predictive model. The framework supports extracting a Bayesian network as an approximation of the black-box model for a specific prediction. Compared to the existing post hoc interpretation methods, the contribution of our approach is three-fold. Firstly, the extracted Bayesian network, as a probabilistic graphical model, can provide interpretations about not only what input features but also why these features contributed to a prediction. Secondly, for complex decision problems with many features, a Markov blanket can be generated from the extracted Bayesian network to provide interpretations with a focused view on those input features that directly contributed to a prediction. Thirdly, the extracted Bayesian network enables the identification of four different rules which can inform the decision-maker about the confidence level in a prediction, thus helping the decision-maker assess the reliability of predictions learned by a black-box model. We implemented the proposed approach, applied it in the context of two well-known public datasets and analysed the results, which are made available in an open-source repository.
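
The Markov blanket mentioned in this abstract has a standard graph-theoretic definition: a node's parents, its children, and the other parents of those children. The sketch below computes it for a small hand-written DAG. The node names and the `parents` dictionary layout are hypothetical, and the code does not reproduce how the paper extracts the Bayesian network from a black-box model.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents + children + the children's other parents.
    """
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = set()
    for child in children:
        co_parents |= parents[child]
    blanket = set(parents.get(node, set())) | children | co_parents
    blanket.discard(node)  # a node is never part of its own blanket
    return blanket

# Hypothetical network standing in for one extracted around a prediction.
dag = {
    "age": set(),
    "credit_history": set(),
    "hobby": set(),
    "income": {"age"},
    "prediction": {"income", "credit_history"},
}

print(sorted(markov_blanket(dag, "prediction")))  # ['credit_history', 'income']
print(sorted(markov_blanket(dag, "income")))      # ['age', 'credit_history', 'prediction']
```

In this toy network, `hobby` falls outside the blanket of `prediction`, which is the sense in which a blanket gives a focused view of the directly relevant features.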

CY | Jun 2, 2025
Explainable AI Systems Must Be Contestable: Here's How to Make It Happen

Catarina Moreira, Anna Palatkina, Dacia Braca et al.

As AI regulations around the world intensify their focus on system safety, contestability has become a mandatory, yet ill-defined, safeguard. In XAI, "contestability" remains an empty promise: no formal definition exists, no algorithm guarantees it, and practitioners lack concrete guidance to satisfy regulatory requirements. Grounded in a systematic literature review, this paper presents the first rigorous formal definition of contestability in explainable AI, directly aligned with stakeholder requirements and regulatory mandates. We introduce a modular framework of by-design and post-hoc mechanisms spanning human-centered interfaces, technical architectures, legal processes, and organizational workflows. To operationalize our framework, we propose the Contestability Assessment Scale, a composite metric built on more than twenty quantitative criteria. Through multiple case studies across diverse application domains, we reveal where state-of-the-art systems fall short and show how our framework drives targeted improvements. By converting contestability from regulatory theory into a practical framework, our work equips practitioners with the tools to embed genuine recourse and accountability into AI systems.

LG | May 27, 2025
TabAttackBench: A Benchmark for Adversarial Attacks on Tabular Data

Zhipeng He, Chun Ouyang, Lijie Wen et al.

Adversarial attacks pose a significant threat to machine learning models by inducing incorrect predictions through imperceptible perturbations to input data. While these attacks are well studied in unstructured domains such as images, their behaviour on tabular data remains underexplored due to mixed feature types and complex inter-feature dependencies. This study introduces a comprehensive benchmark that evaluates adversarial attacks on tabular datasets with respect to both effectiveness and imperceptibility. We assess five white-box attack algorithms (FGSM, BIM, PGD, DeepFool, and C&W) across four representative models (LR, MLP, TabTransformer and FT-Transformer) using eleven datasets spanning finance, energy, and healthcare domains. The benchmark employs four quantitative imperceptibility metrics (proximity, sparsity, deviation, and sensitivity) to characterise perturbation realism. The analysis quantifies the trade-off between these two aspects and reveals consistent differences between attack types, with $\ell_\infty$-based attacks achieving higher success but lower subtlety, and $\ell_2$-based attacks offering more realistic perturbations. The benchmark findings offer actionable insights for designing more imperceptible adversarial attacks, advancing the understanding of adversarial vulnerability in tabular machine learning.
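
To make the simplest of the listed attacks concrete, the sketch below applies a single FGSM step to a hand-specified logistic regression model on numerical features. The weights, input, and `eps` are made-up values, and the code is a textbook FGSM step under those assumptions rather than the benchmark's implementation; it ignores categorical features and valid value ranges.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """One FGSM step: move each feature by eps in the direction that
    increases the cross-entropy loss. For logistic regression the
    gradient of the loss with respect to x is (p - y) * w."""
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input with true label 1.
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 1.0], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
print(x_adv)                            # [0.5, 1.5]
print(predict_proba(w, b, x) > 0.5)     # True  (original classified as 1)
print(predict_proba(w, b, x_adv) > 0.5) # False (prediction flipped)
```

Every feature moves by exactly `eps`, so the perturbation is bounded in the $\ell_\infty$ norm but is not sparse.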

32.5 | CL | Mar 12
Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs

Xing Zi, Xinying Zhou, Jinghao Xiao et al.

While Large Language Models (LLMs) achieve expert-level performance on standard medical benchmarks through single-hop factual recall, they severely struggle with the complex, multi-hop diagnostic reasoning required in real-world clinical settings. A primary obstacle is "shortcut learning", where models exploit highly connected, generic hub nodes (e.g., "inflammation") in knowledge graphs to bypass authentic micro-pathological cascades. To address this, we introduce ShatterMed-QA, a bilingual benchmark of 10,558 multi-hop clinical questions designed to rigorously evaluate deep diagnostic reasoning. Our framework constructs a topology-regularized medical Knowledge Graph using a novel $k$-Shattering algorithm, which physically prunes generic hubs to explicitly sever logical shortcuts. We synthesize the evaluation vignettes by applying implicit bridge entity masking and topology-driven hard negative sampling, forcing models to navigate biologically plausible distractors without relying on superficial elimination. Comprehensive evaluations of 21 LLMs reveal massive performance degradation on our multi-hop tasks, particularly among domain-specific models. Crucially, restoring the masked evidence via Retrieval-Augmented Generation (RAG) triggers near-universal performance recovery, validating ShatterMed-QA's structural fidelity and proving its efficacy in diagnosing the fundamental reasoning deficits of current medical AI. Explore the dataset, interactive examples, and full leaderboards at our project website: https://shattermed-qa-web.vercel.app/

LG | Jul 15, 2025
Crafting Imperceptible On-Manifold Adversarial Attacks for Tabular Data

Zhipeng He, Alexander Stevens, Chun Ouyang et al.

Adversarial attacks on tabular data present unique challenges due to the heterogeneous nature of mixed categorical and numerical features. Unlike images where pixel perturbations maintain visual similarity, tabular data lacks intuitive similarity metrics, making it difficult to define imperceptible modifications. Additionally, traditional gradient-based methods prioritise $\ell_p$-norm constraints, often producing adversarial examples that deviate from the original data distributions. To address this, we propose a latent-space perturbation framework using a mixed-input Variational Autoencoder (VAE) to generate statistically consistent adversarial examples. The proposed VAE integrates categorical embeddings and numerical features into a unified latent manifold, enabling perturbations that preserve statistical consistency. We introduce In-Distribution Success Rate (IDSR) to jointly evaluate attack effectiveness and distributional alignment. Evaluation across six publicly available datasets and three model architectures demonstrates that our method achieves substantially lower outlier rates, higher IDSR, and more consistent performance than traditional input-space attacks and other VAE-based methods adapted from image domain approaches. Our comprehensive analyses of hyperparameter sensitivity, sparsity control, and generative architecture demonstrate that the effectiveness of VAE-based attacks depends strongly on reconstruction quality and the availability of sufficient training data. When these conditions are met, the proposed framework achieves superior practical utility and stability compared with input-space methods. This work underscores the importance of maintaining on-manifold perturbations for generating realistic and robust adversarial examples in tabular domains.

IV | Feb 25, 2025
TagGAN: A Generative Model for Data Tagging

Muhammad Nawaz, Basma Nasir, Tehseen Zia et al.

Precise identification and localization of disease-specific features at the pixel-level are particularly important for early diagnosis, disease progression monitoring, and effective treatment in medical image analysis. However, conventional diagnostic AI systems lack decision transparency and cannot operate well in environments where there is a lack of pixel-level annotations. In this study, we propose a novel Generative Adversarial Networks (GANs)-based framework, TagGAN, which is tailored for weakly-supervised fine-grained disease map generation from purely image-level labeled data. TagGAN generates a pixel-level disease map during domain translation from an abnormal image to a normal representation. Later, this map is subtracted from the input abnormal image to convert it into its normal counterpart while preserving all the critical anatomical details. Our method is the first to generate fine-grained disease maps to visualize disease lesions in a weakly supervised setting without requiring pixel-level annotations. This development enhances the interpretability of diagnostic AI by providing precise visualizations of disease-specific regions. It also introduces automated binary mask generation to assist radiologists. Empirical evaluations carried out on the benchmark datasets, CheXpert, TBX11K, and COVID-19, demonstrate the capability of TagGAN to outperform current top models in accurately identifying disease-specific pixels. This outcome highlights the capability of the proposed model to tag medical images, significantly reducing the workload for radiologists by eliminating the need for binary masks during training.

CV | Feb 25, 2025
Weakly Supervised Pixel-Level Annotation with Visual Interpretability

Basma Nasir, Tehseen Zia, Muhammad Nawaz et al.

Medical image annotation is essential for diagnosing diseases, yet manual annotation is time-consuming, costly, and prone to variability among experts. To address these challenges, we propose an automated explainable annotation system that integrates ensemble learning, visual explainability, and uncertainty quantification. Our approach combines three pre-trained deep learning models - ResNet50, EfficientNet, and DenseNet - enhanced with XGrad-CAM for visual explanations and Monte Carlo Dropout for uncertainty quantification. This ensemble mimics the consensus of multiple radiologists by intersecting saliency maps from models that agree on the diagnosis while uncertain predictions are flagged for human review. We evaluated our system using the TBX11K medical imaging dataset and a Fire segmentation dataset, demonstrating its robustness across different domains. Experimental results show that our method outperforms baseline models, achieving 93.04% accuracy on TBX11K and 96.4% accuracy on the Fire dataset. Moreover, our model produces precise pixel-level annotations despite being trained with only image-level labels, achieving Intersection over Union (IoU) scores of 36.07% and 64.7%, respectively. By enhancing the accuracy and interpretability of image annotations, our approach offers a reliable and transparent solution for medical diagnostics and other image analysis tasks.
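
The consensus-by-intersection idea and the IoU score reported in this abstract can be sketched on tiny binary masks. The sketch assumes each model's saliency map has already been thresholded into a 0/1 mask (the abstract does not say how that is done), and the function names, example masks, and the convention of returning 1.0 for two empty masks are illustrative choices.

```python
def intersect_masks(masks):
    """Pixel-wise AND of equally sized binary masks (lists of rows)."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[int(all(m[r][c] for m in masks)) for c in range(cols)]
            for r in range(rows)]

def iou(pred, truth):
    """Intersection over Union of two binary masks."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0  # assumed convention for empty masks

# Hypothetical thresholded saliency masks from two agreeing models.
mask_a = [[1, 1, 0],
          [1, 1, 0]]
mask_b = [[0, 1, 1],
          [0, 1, 1]]
truth  = [[0, 1, 0],
          [0, 1, 1]]

consensus = intersect_masks([mask_a, mask_b])
print(consensus)              # [[0, 1, 0], [0, 1, 0]]
print(iou(consensus, truth))  # 2 shared pixels out of 3 in the union
```

Intersecting masks keeps only the pixels every model highlights, which tends to trade recall for precision in the resulting annotation.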

CV | Jun 5, 2024
SelfReDepth: Self-Supervised Real-Time Depth Restoration for Consumer-Grade Sensors

Alexandre Duarte, Francisco Fernandes, João M. Pereira et al.

Depth maps produced by consumer-grade sensors suffer from inaccurate measurements and missing data from either system or scene-specific sources. Data-driven denoising algorithms can mitigate such problems. However, they require vast amounts of ground truth depth data. Recent research has tackled this limitation using self-supervised learning techniques, but it requires multiple RGB-D sensors. Moreover, most existing approaches focus on denoising single isolated depth maps or specific subjects of interest, highlighting a need for methods to effectively denoise depth maps in real-time dynamic environments. This paper extends state-of-the-art approaches for depth-denoising commodity depth devices, proposing SelfReDepth, a self-supervised deep learning technique for depth restoration, via denoising and hole-filling by inpainting full-depth maps captured with RGB-D sensors. The algorithm targets depth data in video streams, utilizing multiple sequential depth frames coupled with color data to achieve high-quality depth videos with temporal coherence. Finally, SelfReDepth is designed to be compatible with various RGB-D sensors and usable in real-time scenarios as a pre-processing step before applying other depth-dependent algorithms. Our results demonstrate our approach's real-time performance on real-world datasets. They show that it outperforms state-of-the-art methods in denoising and restoration at over 30 fps on commercial depth cameras, with potential benefits for augmented and mixed-reality applications.

AI | Mar 14, 2024
Generating Feasible and Plausible Counterfactual Explanations for Outcome Prediction of Business Processes

Alexander Stevens, Chun Ouyang, Johannes De Smedt et al.

In recent years, various machine and deep learning architectures have been successfully introduced to the field of predictive process analytics. Nevertheless, the inherent opacity of these algorithms poses a significant challenge for human decision-makers, hindering their ability to understand the reasoning behind the predictions. This growing concern has sparked the introduction of counterfactual explanations, designed as human-understandable "what-if" scenarios, to provide clearer insights into the decision-making process behind undesirable predictions. The generation of counterfactual explanations, however, encounters specific challenges when dealing with the sequential nature of the (business) process cases typically used in predictive process analytics. Our paper tackles this challenge by introducing a data-driven approach, REVISEDplus, to generate more feasible and plausible counterfactual explanations. First, we restrict the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data, ensuring that the proposed counterfactuals are realistic and feasible within the observed process data distribution. Additionally, we ensure plausibility by learning sequential patterns between the activities in the process cases, utilising Declare language templates. Finally, we evaluate the properties that define the validity of counterfactuals.

NC | Feb 15, 2022
An Extension Of Combinatorial Contextuality For Cognitive Protocols

Abdul Karim Obeid, Peter Bruza, Catarina Moreira et al.

This article extends the combinatorial approach to support the determination of contextuality amidst causal influences. Contextuality is an active field of study in Quantum Cognition, in systems relating to mental phenomena, such as concepts in human memory [Aerts et al., 2013]. In the cognitive field of study, a contemporary challenge facing the determination of whether a phenomenon is contextual has been the identification and management of disturbances [Dzhafarov et al., 2016]. Whether or not said disturbances are identified through the modelling approach, constitute causal influences, or are disregardable as noise is important, as contextuality cannot be adequately determined in the presence of causal influences [Gleason, 1957]. To address this challenge, we first provide a formalisation of necessary elements of the combinatorial approach within the language of canonical causal models. Through this formalisation, we extend the combinatorial approach to support a measurement and treatment of disturbance, and offer techniques to separately distinguish noise and causal influences. Thereafter, we develop a protocol through which these elements may be represented within a cognitive experiment. As human cognition seems rife with causal influences, cognitive modellers may apply the extended combinatorial approach to practically determine the contextuality of cognitive phenomena.

LG | Sep 3, 2021
Building Interpretable Models for Business Process Prediction using Shared and Specialised Attention Mechanisms

Bemali Wickramanayake, Zhipeng He, Chun Ouyang et al.

In this paper, we address the "black-box" problem in predictive process analytics by building interpretable models that are capable of explaining both what a prediction is and why it was made. Predictive process analytics is a newly emerged discipline dedicated to providing business process intelligence in modern organisations. It uses event logs, which capture process execution traces in the form of multi-dimensional sequence data, as the key input to train predictive models. These predictive models, often built upon deep learning techniques, can be used to make predictions about the future states of business process execution. We apply attention mechanisms to achieve model interpretability. We propose i) two types of attention: event attention to capture the impact of specific process events on a prediction, and attribute attention to reveal which attribute(s) of an event influenced the prediction; and ii) two attention mechanisms: a shared attention mechanism and a specialised attention mechanism, reflecting different design decisions on whether to construct attribute attention on individual input features (specialised) or on the concatenated feature tensor of all input feature vectors (shared). These lead to two distinct attention-based models, and both are interpretable models that incorporate interpretability directly into the structure of a process predictive model. We conduct an experimental evaluation of the proposed models using a real-life dataset and a comparative analysis of the models for accuracy and interpretability, and draw insights from the evaluation and analysis results.

LGJul 19, 2021
DiCE4EL: Interpreting Process Predictions using a Milestone-Aware Counterfactual Approach

Chihcheng Hsieh, Catarina Moreira, Chun Ouyang

Predictive process analytics often applies machine learning to predict the future states of a running business process. However, the internal mechanisms of many existing predictive algorithms are opaque and a human decision-maker is unable to understand why a certain activity was predicted. Recently, counterfactuals have been proposed in the literature to derive human-understandable explanations from predictive models. Current counterfactual approaches consist of finding the minimum feature change that can make a certain prediction flip its outcome. Although many algorithms have been proposed, their application to multi-dimensional sequence data like event logs has not been explored in the literature. In this paper, we explore the use of a recent, popular model-agnostic counterfactual algorithm, DiCE, in the context of predictive process analytics. The analysis reveals that DiCE is unable to derive explanations for process predictions, due to (1) process domain knowledge not being taken into account, (2) long traces of process execution that often tend to be less understandable, and (3) difficulties in optimising the counterfactual search with categorical variables. We design an extension of DiCE, namely DiCE4EL (DiCE for Event Logs), that can generate counterfactual explanations for process prediction, and propose an approach that supports deriving milestone-aware counterfactual explanations at key intermediate stages along process execution to promote interpretability. We apply our approach to a publicly available real-life event log and the analysis results demonstrate the effectiveness of the proposed approach.

AIJul 16, 2021
Explainable AI Enabled Inspection of Business Process Prediction Models

Chun Ouyang, Renuka Sindhgatta, Catarina Moreira

Modern data analytics underpinned by machine learning techniques has become a key enabler to the automation of data-led decision making. As an important branch of state-of-the-art data analytics, business process predictions also face a challenge regarding the lack of explanation of the reasoning and decisions of the underlying 'black-box' prediction models. With the development of interpretable machine learning techniques, explanations can be generated for a black-box model, making it possible for (human) users to access the reasoning behind machine learned predictions. In this paper, we aim to present an approach that allows us to use model explanations to investigate certain reasoning applied by machine learned predictions and detect potential issues with the underlying methods, thus enhancing trust in business process prediction models. A novel contribution of our approach is the proposal of model inspection that leverages both the explanations generated by interpretable machine learning mechanisms and the contextual or domain knowledge extracted from event logs that record historical process execution. Findings drawn from this work are expected to serve as a key input to developing model reliability metrics and evaluation in the context of business process predictions.

LGJun 16, 2021
Developing a Fidelity Evaluation Approach for Interpretable Machine Learning

Mythreyi Velmurugan, Chun Ouyang, Catarina Moreira et al.

Although modern machine learning and deep learning methods allow for complex and in-depth data analytics, the predictive models generated by these methods are often highly complex, and lack transparency. Explainable AI (XAI) methods are used to improve the interpretability of these complex models, and in doing so improve transparency. However, the inherent fitness of these explainable methods can be hard to evaluate. In particular, methods to evaluate the fidelity of the explanation to the underlying black box require further development, especially for tabular data. In this paper, we (a) propose a three phase approach to developing an evaluation method; (b) adapt an existing evaluation method primarily for image and text data to evaluate models trained on tabular data; and (c) evaluate two popular explainable methods using this evaluation method. Our evaluations suggest that the internal mechanism of the underlying predictive model, the internal mechanism of the explainable method used and model and data complexity all affect explanation fidelity. Given that explanation fidelity is so sensitive to context and tools and data used, we could not clearly identify any specific explainable method as being superior to another.

AIMay 16, 2021
Order Effects in Bayesian Updates

Catarina Moreira, Jose Acacio de Barros

Order effects occur when judgments about a hypothesis's probability given a sequence of information do not equal the probability of the same hypothesis when the information is reversed. Different experiments have been performed in the literature that support evidence of order effects. We proposed a Bayesian update model for order effects where each question can be thought of as a mini-experiment where the respondents reflect on their beliefs. We showed that order effects appear, and they have a simple cognitive explanation: the respondent's prior belief that two questions are correlated. The proposed Bayesian model allows us to make several predictions: (1) we found certain conditions on the priors that limit the existence of order effects; (2) we show that, for our model, the QQ equality is not necessarily satisfied (due to symmetry assumptions); and (3) the proposed Bayesian model has the advantage of possessing fewer parameters than its quantum counterpart.

AIMar 7, 2021
Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications

Yu-Liang Chou, Catarina Moreira, Peter Bruza et al.

There has been a growing interest in model-agnostic methods that can make deep learning models more transparent and explainable to a user. Some researchers recently argued that for a machine to achieve a certain degree of human-level explainability, this machine needs to provide human causally understandable explanations, also known as causability. A specific class of algorithms that have the potential to provide causability are counterfactuals. This paper presents an in-depth systematic review of the diverse existing body of literature on counterfactuals and causability for explainable artificial intelligence. We performed an LDA topic modelling analysis under a PRISMA framework to find the most relevant literature articles. This analysis resulted in a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications in real-world data. This research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded on a causal theoretical formalism and, consequently, cannot promote causability to a human decision-maker. Our findings suggest that the explanations derived from major algorithms in the literature provide spurious correlations rather than cause/effect relationships, leading to sub-optimal, erroneous or even biased explanations. This paper also advances the literature with new directions and challenges on promoting causability in model-agnostic approaches for explainable artificial intelligence.

LGJun 2, 2020
Construction of 'Support Vector' Machine Feature Spaces via Deformed Weyl-Heisenberg Algebra

Shahram Dehdashti, Catarina Moreira, Abdul Karim Obeid et al.

This paper uses deformed coherent states, based on a deformed Weyl-Heisenberg algebra that unifies the well-known SU(2), Weyl-Heisenberg, and SU(1,1) groups, through a common parameter. We show that deformed coherent states provide the theoretical foundation of a meta-kernel function, that is a kernel which in turn defines kernel functions. Kernel functions drive developments in the field of machine learning and the meta-kernel function presented in this paper opens new theoretical avenues for the definition and exploration of kernel functions. The meta-kernel function applies associated revolution surfaces as feature spaces identified with non-linear coherent states. An empirical investigation compares the deformed SU(2) and SU(1,1) kernels derived from the meta-kernel which shows performance similar to the Radial Basis kernel, and offers new insights (based on the deformed Weyl-Heisenberg algebra).

AIMay 30, 2020
QuLBIT: Quantum-Like Bayesian Inference Technologies for Cognition and Decision

Catarina Moreira, Matheus Hammes, Rasim Serdar Kurdoglu et al.

This paper provides the foundations of a unified cognitive decision-making framework (QuLBIT) which is derived from quantum theory. The main advantage of this framework is that it can cater for paradoxical and irrational human decision making. Although quantum approaches for cognition have demonstrated advantages over classical probabilistic approaches and bounded rationality models, they still lack explanatory power. To address this, we introduce a novel explanatory analysis of the decision-maker's belief space. This is achieved by exploiting quantum interference effects as a way of both quantifying and explaining the decision-maker's uncertainty. We detail the main modules of the unified framework, the explanatory analysis method, and illustrate their application in situations violating the Sure Thing Principle.

LGFeb 21, 2020
An Investigation of Interpretability Techniques for Deep Learning in Predictive Process Analytics

Catarina Moreira, Renuka Sindhgatta, Chun Ouyang et al.

This paper explores interpretability techniques for two of the most successful learning algorithms in medical decision-making literature: deep neural networks and random forests. We applied these algorithms to a real-world medical dataset containing information about patients with cancer, where we learn models that try to predict the type of cancer of the patient, given their set of medical activity records. We explored different algorithms based on neural network architectures using long short term deep neural networks, and random forests. Since there is a growing need to provide decision-makers with an understanding of the logic behind black-box predictions, we also explored different techniques that provide interpretations for these classifiers. In one of the techniques, we intercepted some hidden layers of these neural networks and used autoencoders to learn the representation of the input in the hidden layers. In another, we investigated an interpretable model locally around the random forest's prediction. Results show that learning an interpretable model locally around the model's prediction leads to a better understanding of why the algorithm makes a given decision. The use of a local, linear model helps identify the features used in the prediction of a specific instance or data point. We see certain distinct features used for predictions that provide useful insights about the type of cancer, along with features that do not generalize well. In addition, the structured deep learning approach using autoencoders provided meaningful prediction insights, which resulted in the identification of nonlinear clusters corresponding to the patients' different types of cancer.

LGDec 22, 2019
Exploring Interpretability for Predictive Process Analytics

Renuka Sindhgatta, Chun Ouyang, Catarina Moreira

Modern predictive analytics underpinned by machine learning techniques has become a key enabler to the automation of data-driven decision making. In the context of business process management, predictive analytics has been applied to making predictions about the future state of an ongoing business process instance, for example, when will the process instance complete and what will be the outcome upon completion. Machine learning models can be trained on event log data recording historical process execution to build the underlying predictive models. Multiple techniques have been proposed so far which encode the information available in an event log and construct input features required to train a predictive model. While accuracy has been a dominant criterion in the choice of various techniques, they are often applied as a black-box in building predictive models. In this paper, we derive explanations using interpretable machine learning techniques to compare and contrast the suitability of multiple predictive models of high accuracy. The explanations allow us to gain an understanding of the underlying reasons for a prediction and highlight scenarios where accuracy alone may not be sufficient in assessing the suitability of techniques used to encode event log data to features used by a predictive model. Findings from this study motivate the need and importance to incorporate interpretability in predictive process analytics.

AIMay 11, 2019
Towards a Quantum-Like Cognitive Architecture for Decision-Making

Catarina Moreira, Lauren Fell, Shahram Dehdashti et al.

We propose an alternative and unifying framework for decision-making that, by using quantum mechanics, provides more generalised cognitive and decision models with the ability to represent more information than classical models. This framework can accommodate and predict several cognitive biases reported in Lieder & Griffiths without heavy reliance on heuristics or on assumptions about the computational resources of the mind.

AINov 29, 2018
Unifying Decision-Making: a Review on Evolutionary Theories on Rationality and Cognitive Biases

Catarina Moreira

In this paper, we review the concepts of rationality across several different fields, namely economics, psychology, evolutionary biology and behavioural ecology. We review how processes like natural selection can help us understand the evolution of cognition and how cognitive biases might be a consequence of this natural selection. In the end, we argue that humans are not irrational, but rather rationally bounded, and we complement the discussion with how quantum cognitive models can contribute to the modelling and prediction of human paradoxical decisions.

AIJul 16, 2018
Introducing Quantum-Like Influence Diagrams for Violations of the Sure Thing Principle

Catarina Moreira, Andreas Wichert

It is the focus of this work to extend and study the previously proposed quantum-like Bayesian networks to deal with decision-making scenarios by incorporating the notion of maximum expected utility in influence diagrams. The general idea is to take advantage of the quantum interference terms produced in the quantum-like Bayesian Network to influence the probabilities used to compute the expected utility of some action. This way, we are not proposing a new type of expected utility hypothesis. On the contrary, we are keeping it under its classical definition. We are only incorporating it as an extension of a probabilistic graphical model in a compact graphical representation called an influence diagram in which the utility function depends on the probabilistic influences of the quantum-like Bayesian network. Our findings suggest that the proposed quantum-like influence diagram can indeed take advantage of the quantum interference effects of quantum-like Bayesian Networks to maximise the utility of a cooperative behaviour to the detriment of a fully rational defect behaviour under the prisoner's dilemma game.

AIOct 2, 2017
The Dutch's Real World Financial Institute: Introducing Quantum-Like Bayesian Networks as an Alternative Model to deal with Uncertainty

Catarina Moreira, Emmanuel Haven, Sandro Sozzo et al.

In this work, we analyse and model a real life financial loan application belonging to a sample bank in the Netherlands. The log is robust in terms of data, containing a total of 262 200 event logs, belonging to 13 087 different credit applications. The dataset is heterogeneous and consists of a mixture of computer generated automatic processes and manual human tasks. The goal is to work out a decision model, which represents the underlying tasks that make up the loan application service, and to assess potential areas of improvement of the institution's internal processes. To this end we study the impact of incomplete event logs for the extraction and analysis of business processes. It is quite common that event logs are incomplete with several amounts of missing information (for instance, workers forget to register their tasks). Absence of data is translated into a drastic decrease of precision and compromises the decision models, leading to biased and unrepresentative results. We investigate how classical probabilistic models are affected by incomplete event logs and we explore quantum-like probabilistic inferences as an alternative mathematical model to classical probability. This work represents a first step towards systematic investigation of the impact of quantum interference in a real life large scale decision scenario. The results obtained in this study indicate that, under high levels of uncertainty, the quantum-like models generate quantum interference terms, which allow an additional non-linear parameterisation of the data. Experimental results attest the efficiency of the quantum-like Bayesian networks, since the application of interference terms is able to reduce the error percentage of inferences performed over quantum-like models when compared to inferences produced by classical models.

AIAug 26, 2015
The Relation Between Acausality and Interference in Quantum-Like Bayesian Networks

Catarina Moreira, Andreas Wichert

We analyse a quantum-like Bayesian Network that puts together cause/effect relationships and semantic similarities between events. These semantic similarities constitute acausal connections according to the Synchronicity principle and provide new relationships to quantum-like probabilistic graphical models. As a consequence, beliefs (or any other event) can be represented in vector spaces, in which quantum parameters are determined by the similarities that these vectors share between them. Events attached by a semantic meaning do not need to have an explanation in terms of cause and effect.

AIMar 25, 2015
An Experiment on Using Bayesian Networks for Process Mining

Catarina Moreira

Process mining is a technique that performs an automatic analysis of business processes from a log of events with the promise of understanding how processes are executed in an organisation. Several models have been proposed to address this problem, however, here we propose a different approach to deal with uncertainty. By uncertainty, we mean estimating the probability of some sequence of tasks occurring in a business process, given that only a subset of tasks may be observable. In this sense, this work proposes a new approach to perform process mining using Bayesian Networks. These structures can take into account the probability of a task being present or absent in the business process. Moreover, Bayesian Networks are able to automatically learn these probabilities through mechanisms such as the maximum likelihood estimate and EM clustering. Experiments made over a Loan Application Case study suggest that Bayesian Networks are adequate structures for process mining and enable a deep analysis of the business process model that can be used to answer queries about that process.

IRFeb 12, 2015
On Projection Based Operators in Lp space for Exact Similarity Search

Andreas Wichert, Catarina Moreira

We investigate exact indexing for high dimensional Lp norms based on the 1-Lipschitz property and projection operators. The orthogonal projection that satisfies the 1-Lipschitz property for the Lp norm is described. The adaptive projection defined by the first principal component is introduced.

IRJan 21, 2015
Using Rank Aggregation for Expert Search in Academic Digital Libraries

Catarina Moreira, Bruno Martins, Pável Calado

The task of expert finding has been getting increasing attention in information retrieval literature. However, the current state-of-the-art is still lacking in principled approaches for combining different sources of evidence. This paper explores the usage of unsupervised rank aggregation methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph-structure of the citation patterns for the community of experts, and from profile information about the experts. We specifically experimented with two unsupervised rank aggregation approaches well known in the information retrieval literature, namely CombSUM and CombMNZ. Experiments made over a dataset of academic publications for the area of Computer Science attest to the adequacy of these methods.
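
The two aggregation methods named above can be illustrated with a minimal sketch. It follows the textbook definitions (CombSUM sums a candidate's normalised scores across rankers; CombMNZ multiplies that sum by the number of rankers giving the candidate a non-zero score). The min-max normalisation, the function names, and the toy scores are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def min_max_normalise(scores):
    """Rescale one ranker's scores to [0, 1] (an assumed normalisation choice)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {cand: 1.0 for cand in scores}
    return {cand: (s - lo) / (hi - lo) for cand, s in scores.items()}


def comb_sum(rankers):
    """CombSUM: sum of each candidate's normalised scores over all rankers."""
    fused = defaultdict(float)
    for scores in rankers:
        for cand, s in min_max_normalise(scores).items():
            fused[cand] += s
    return dict(fused)


def comb_mnz(rankers):
    """CombMNZ: CombSUM times the number of rankers with a non-zero score."""
    sums = comb_sum(rankers)
    hits = defaultdict(int)
    for scores in rankers:
        for cand, s in min_max_normalise(scores).items():
            if s > 0:  # note: the lowest-scored candidate normalises to 0
                hits[cand] += 1
    return {cand: sums[cand] * hits[cand] for cand in sums}


def rank(fused):
    """Order candidates by fused score, breaking ties alphabetically."""
    return sorted(fused, key=lambda cand: (-fused[cand], cand))


# Hypothetical expertise estimators: text, citation graph, profile.
text = {"alice": 3.0, "bob": 1.0, "carol": 2.0}
citations = {"alice": 10, "bob": 50}
profile = {"bob": 0.2, "carol": 0.9}
estimators = [text, citations, profile]

print(rank(comb_sum(estimators)))
print(rank(comb_mnz(estimators)))
```

CombMNZ rewards candidates that several estimators agree on, which is why it is often preferred when the individual estimators are noisy.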

IRJan 21, 2015
Learning to Rank Academic Experts in the DBLP Dataset

Catarina Moreira, Bruno Martins, Pável Calado

Expert finding is an information retrieval task that is concerned with the search for the most knowledgeable people with respect to a specific topic, and the search is based on documents that describe people's activities. The task involves taking a user query as input and returning a list of people who are sorted by their level of expertise with respect to the user query. Despite recent interest in the area, the current state-of-the-art techniques lack principled approaches for optimally combining different sources of evidence. This article proposes two frameworks for combining multiple estimators of expertise. These estimators are derived from textual contents, from graph-structure of the citation patterns for the community of experts, and from profile information about the experts. More specifically, this article explores the use of supervised learning to rank methods, as well as rank aggregation approaches, for combining all of the estimators of expertise. Several supervised learning algorithms, which are representative of the pointwise, pairwise and listwise approaches, were tested, and various state-of-the-art data fusion techniques were also explored for the rank aggregation framework. Experiments that were performed on a dataset of academic publications from the Computer Science domain attest to the adequacy of the proposed approaches.

AISep 30, 2014
Interference Effects in Quantum Belief Networks

Catarina Moreira, Andreas Wichert

Probabilistic graphical models such as Bayesian Networks are one of the most powerful structures known by the Computer Science community for deriving probabilistic inferences. However, modern cognitive psychology has revealed that human decisions could not follow the rules of classical probability theory, because humans cannot process large amounts of data in order to make judgements. Consequently, the inferences performed are based on limited data coupled with several heuristics, leading to violations of the law of total probability. This means that probabilistic graphical models based on classical probability theory are too limited to fully simulate and explain various aspects of human decision making. Quantum probability theory was developed in order to accommodate the paradoxical findings that the classical theory could not explain. Recent findings in cognitive psychology revealed that quantum probability can fully describe human decisions in an elegant framework. Their findings suggest that, before taking a decision, human thoughts are seen as superposed waves that can interfere with each other, influencing the final decision. In this work, we propose a new Bayesian Network based on the psychological findings of cognitive scientists. We made experiments with two very well known Bayesian Networks from the literature. The results obtained revealed that the quantum-like Bayesian Network can drastically affect the probabilistic inferences, especially when the levels of uncertainty of the network are very high (no pieces of evidence observed). When the levels of uncertainty are very low, then the proposed quantum-like network collapses to its classical counterpart.
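
To make the violation of the law of total probability concrete, the sketch below contrasts the classical marginal with a quantum-like marginal that adds an interference term of the form 2·sqrt(t1·t2)·cos(theta), a form commonly used in the quantum cognition literature. It is an illustration under stated assumptions, not the paper's network: the single binary parent, the shared phase `theta` for both outcomes, and all function names are hypothetical.

```python
import math


def classical_total(p_a, p_b_given_a, p_b_given_not_a):
    """Law of total probability for a binary parent A and binary child B."""
    return p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a


def quantum_like_total(p_a, p_b_given_a, p_b_given_not_a, theta):
    """Marginal of B with an interference term, renormalised over B's outcomes.

    theta is a free phase parameter (assumed shared by both outcomes here).
    """
    def unnormalised(pb_a, pb_not_a):
        t1 = p_a * pb_a
        t2 = (1 - p_a) * pb_not_a
        return t1 + t2 + 2 * math.sqrt(t1 * t2) * math.cos(theta)

    yes = unnormalised(p_b_given_a, p_b_given_not_a)
    no = unnormalised(1 - p_b_given_a, 1 - p_b_given_not_a)
    if yes + no <= 0:
        raise ValueError("fully destructive interference: cannot normalise")
    return yes / (yes + no)


p_classical = classical_total(0.3, 0.8, 0.4)
p_no_interference = quantum_like_total(0.3, 0.8, 0.4, math.pi / 2)
p_constructive = quantum_like_total(0.3, 0.8, 0.4, 0.0)
print(p_classical, p_no_interference, p_constructive)
```

With `theta = pi/2` the cosine vanishes and the classical value is recovered; any other phase shifts the marginal away from it, which is the kind of deviation the abstract attributes to interference.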

AIJun 12, 2013
Finding Academic Experts on a MultiSensor Approach using Shannon's Entropy

Catarina Moreira, Andreas Wichert

Expert finding is an information retrieval task concerned with the search for the most knowledgeable people in some topic, based on documents describing people's activities. The task involves taking a user query as input and returning a list of people sorted by their level of expertise regarding the user query. This paper introduces a novel approach for combining multiple estimators of expertise based on a multisensor data fusion framework together with the Dempster-Shafer theory of evidence and Shannon's entropy. More specifically, we defined three sensors which detect heterogeneous information derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the academic experts. Given the evidence collected, each sensor may define different candidates as experts, and consequently the sensors may not agree on a final ranking decision. To deal with these conflicts, we applied the Dempster-Shafer theory of evidence combined with Shannon's Entropy formula to fuse this information and come up with a more accurate and reliable final ranking list. Experiments made over two datasets of academic publications from the Computer Science domain attest to the adequacy of the proposed approach over the traditional state-of-the-art approaches. We also made experiments against representative supervised state-of-the-art algorithms. Results revealed that the proposed method achieved a similar performance when compared to these supervised techniques, confirming the capabilities of the proposed framework.
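
The two building blocks named in this abstract can be sketched in a few lines: Dempster's rule of combination for fusing two sensors' mass functions, and Shannon's entropy of a probability distribution. How the paper couples the two is not stated in the abstract, so the sketch shows them separately; the sensor masses, candidate names, and function names are hypothetical.

```python
import math


def dempster_combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sensors are in total conflict")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


A, B = frozenset({"alice"}), frozenset({"bob"})
EITHER = A | B  # mass left on the whole frame expresses the sensor's ignorance

# Hypothetical masses from a text sensor and a citation sensor.
text_sensor = {A: 0.6, B: 0.1, EITHER: 0.3}
citation_sensor = {A: 0.2, B: 0.5, EITHER: 0.3}

fused = dempster_combine(text_sensor, citation_sensor)
print(fused)
print(shannon_entropy([0.5, 0.5]))
```

The normalisation by `1 - conflict` redistributes the mass that the two sensors assigned to incompatible candidates, which is how the rule resolves disagreement between sensors.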

CLFeb 6, 2013
Towards the Rapid Development of a Natural Language Understanding Module

Catarina Moreira, Ana Cristina Mendes, Luísa Coheur et al.

When developing a conversational agent, there is often an urgent need to have a prototype available in order to test the application with real users. A Wizard of Oz is a possibility, but sometimes the agent should be simply deployed in the environment where it will be used. Here, the agent should be able to capture as many interactions as possible and to understand how people react to failure. In this paper, we focus on the rapid development of a natural language understanding module by non-experts. Our approach follows the learning paradigm and sees the process of understanding natural language as a classification problem. We test our module with a conversational agent that answers questions in the art domain. Moreover, we show how our approach can be used by a natural language interface to a cinema database.

IRFeb 2, 2013
Learning to Rank for Expert Search in Digital Libraries of Academic Publications

Catarina Moreira, Pável Calado, Bruno Martins

The task of expert finding has been getting increasing attention in information retrieval literature. However, the current state-of-the-art is still lacking in principled approaches for combining different sources of evidence in an optimal way. This paper explores the usage of learning to rank methods as a principled approach for combining multiple estimators of expertise, derived from the textual contents, from the graph-structure of the citation patterns for the community of experts, and from profile information about the experts. Experiments made over a dataset of academic publications, for the area of Computer Science, attest to the adequacy of the proposed approaches.