AIOct 30, 2023
Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research DirectionsLuca Longo, Mario Brcic, Federico Cabitza et al.
As systems based on opaque Artificial Intelligence (AI) continue to flourish in diverse real-world applications, understanding these black box models has become paramount. In response, Explainable AI (XAI) has emerged as a field of research with practical and ethical benefits across various domains. This paper not only highlights the advancements in XAI and its application in real-world scenarios but also addresses the ongoing challenges within XAI, emphasizing the need for broader perspectives and collaborative efforts. We bring together experts from diverse fields to identify open problems, striving to synchronize research agendas and accelerate XAI in practical applications. By fostering collaborative discussion and interdisciplinary cooperation, we aim to propel XAI forward, contributing to its continued success. Our goal is to put forward a comprehensive proposal for advancing XAI. To achieve this goal, we present a manifesto of 27 open problems categorized into nine categories. These challenges encapsulate the complexities and nuances of XAI and offer a road map for future research. For each problem, we provide promising research directions in the hope of harnessing the collective intelligence of interested stakeholders.
66.0AIMay 31
The Case for Model Science: Verify, Explore, Steer, RefinePrzemyslaw Biecek, Luca Longo, Jianlong Zhou et al.
We argue that the AI community is now ready to move beyond benchmarking and consolidate scattered efforts in model analysis into a systematic discipline, a direction we term Model Science. Complex AI models now serve billions of users, yet our understanding of how they work lags far behind our ability to deploy them. Decades of benchmark-driven research have delivered remarkable progress: extensive leaderboards, a wide range of performance metrics, tracking capability gains across diverse tasks; yet this success has also revealed the limits of benchmarks as they tell us whether models perform but not why they succeed or fail, they miss critical failure modes, such as hallucinations or shortcuts. Precedents from established sciences point the way forward: cognitive science shows that understanding complex systems requires complementary levels of analysis; neuroscience demonstrates that deep study of single cases reveals what population studies miss; medicine teaches that specialised training must develop alongside research practice; and agriculture models how shared infrastructure and principles enable cumulative progress. These lessons inform three foundations for Model Science. First, we propose to consolidate research around four functional perspectives: Verify, Explore, Steer, and Refine that address complementary questions about model behaviour. Second, we discuss the required infrastructure for cumulative knowledge: catalogues of datasets, models and findings. Third, we highlight the need for deep analysis of individual model instances, not just model families, because single cases can reveal what population studies miss.
SPSep 21, 2022
Modeling cognitive load as a self-supervised brain rate with electroencephalography and deep learningLuca Longo
The principal reason for measuring mental workload is to quantify the cognitive cost of performing tasks to predict human performance. Unfortunately, a method for assessing mental workload that has general applicability does not exist yet. This research presents a novel self-supervised method for mental workload modelling from EEG data employing Deep Learning and a continuous brain rate, an index of cognitive activation, without requiring human declarative knowledge. This method is a convolutional recurrent neural network trainable with spatially preserving spectral topographic head-maps from EEG data to fit the brain rate variable. Findings demonstrate the capacity of the convolutional layers to learn meaningful high-level representations from EEG data since within-subject models had a test Mean Absolute Percentage Error average of 11%. The addition of a Long-Short Term Memory layer for handling sequences of high-level representations was not significant, although it did improve their accuracy. Findings point to the existence of quasi-stable blocks of learnt high-level representations of cognitive activation because they can be induced through convolution and seem not to be dependent on each other over time, intuitively matching the non-stationary nature of brain responses. Across-subject models, induced with data from an increasing number of participants, thus containing more variability, obtained a similar accuracy to the within-subject models. This highlights the potential generalisability of the induced high-level representations across people, suggesting the existence of subject-independent cognitive activation patterns. This research contributes to the body of knowledge by providing scholars with a novel computational method for mental workload modelling that aims to be generally applicable, does not rely on ad-hoc human-crafted models supporting replicability and falsifiability.
AIJun 28, 2022
Comparing and extending the use of defeasible argumentation with quantitative data in real-world contextsLucas Rizzo, Luca Longo
Dealing with uncertain, contradicting, and ambiguous information is still a central issue in Artificial Intelligence (AI). As a result, many formalisms have been proposed or adapted so as to consider non-monotonicity, with only a limited number of works and researchers performing any sort of comparison among them. A non-monotonic formalism is one that allows the retraction of previous conclusions or claims, from premises, in light of new evidence, offering some desirable flexibility when dealing with uncertainty. This research article focuses on evaluating the inferential capacity of defeasible argumentation, a formalism particularly envisioned for modelling non-monotonic reasoning. In addition to this, fuzzy reasoning and expert systems, extended for handling non-monotonicity of reasoning, are selected and employed as baselines, due to their vast and accepted use within the AI community. Computational trust was selected as the domain of application of such models. Trust is an ill-defined construct, hence, reasoning applied to the inference of trust can be seen as non-monotonic. Inference models were designed to assign trust scalars to editors of the Wikipedia project. In particular, argument-based models demonstrated more robustness than those built upon the baselines despite the knowledge bases or datasets employed. This study contributes to the body of knowledge through the exploitation of defeasible argumentation and its comparison to similar approaches. The practical use of such approaches coupled with a modular design that facilitates similar experiments was exemplified and their respective implementations made publicly available on GitHub [120, 121]. This work adds to previous works, empirically enhancing the generalisability of defeasible argumentation as a compelling approach to reason with quantitative data and uncertain knowledge.
CLJul 16, 2024
GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News TextKyle Hamilton, Luca Longo, Bojan Bozic
While the use of machine learning for the detection of propaganda techniques in text has garnered considerable attention, most approaches focus on "black-box" solutions with opaque inner workings. Interpretable approaches provide a solution, however, they depend on careful feature engineering and costly expert annotated data. Additionally, language features specific to propagandistic text are generally the focus of rhetoricians or linguists, and there is no data set labeled with such features suitable for machine learning. This study codifies 22 rhetorical and linguistic features identified in literature related to the language of persuasion for the purpose of annotating an existing data set labeled with propaganda techniques. To help human experts annotate natural language sentences with these features, RhetAnn, a web application, was specifically designed to minimize an otherwise considerable mental effort. Finally, a small set of annotated data was used to fine-tune GPT-3.5, a generative large language model (LLM), to annotate the remaining data while optimizing for financial cost and classification accuracy. This study demonstrates how combining a small number of human annotated examples with GPT can be an effective strategy for scaling the annotation process at a fraction of the cost of traditional annotation relying solely on human experts. The results are on par with the best performing model at the time of writing, namely GPT-4, at 10x less the cost. Our contribution is a set of features, their properties, definitions, and examples in a machine-readable format, along with the code for RhetAnn and the GPT prompts and fine-tuning procedures for advancing state-of-the-art interpretable propaganda technique detection.
AIJun 27, 2023
A novel structured argumentation framework for improved explainability of classification tasksLucas Rizzo, Luca Longo
This paper presents a novel framework for structured argumentation, named extend argumentative decision graph ($xADG$). It is an extension of argumentative decision graphs built upon Dung's abstract argumentation graphs. The $xADG$ framework allows for arguments to use boolean logic operators and multiple premises (supports) within their internal structure, resulting in more concise argumentation graphs that may be easier for users to understand. The study presents a methodology for construction of $xADGs$ and evaluates their size and predictive capacity for classification tasks of varying magnitudes. Resulting $xADGs$ achieved strong (balanced) accuracy, which was accomplished through an input decision tree, while also reducing the average number of supports needed to reach a conclusion. The results further indicated that it is possible to construct plausibly understandable $xADGs$ that outperform other techniques for building $ADGs$ in terms of predictive capacity and overall size. In summary, the study suggests that $xADG$ represents a promising framework to developing more concise argumentative models that can be used for classification tasks and knowledge discovery, acquisition, and refinement.
13.8LGMar 13
L2GTX: From Local to Global Time Series ExplanationsEphrem Tibebe Mekonnen, Luca Longo, Lucas Rizzo et al.
Deep learning models achieve high accuracy in time series classification, yet understanding their class-level decision behaviour remains challenging. Explanations for time series must respect temporal dependencies and identify patterns that recur across instances. Existing approaches face three limitations: model-agnostic XAI methods developed for images and tabular data do not readily extend to time series, global explanation synthesis for time series remains underexplored, and most existing global approaches are model-specific. We propose L2GTX, a model-agnostic framework that generates class-wise global explanations by aggregating local explanations from a representative set of instances. L2GTX extracts clusters of parameterised temporal event primitives, such as increasing or decreasing trends and local extrema, together with their importance scores from instance-level explanations produced by LOMATCE. These clusters are merged across instances to reduce redundancy, and an instance-cluster importance matrix is used to estimate global relevance. Under a user-defined instance selection budget, L2GTX selects representative instances that maximise coverage of influential clusters. Events from the selected instances are then aggregated into concise class-wise global explanations. Experiments on six benchmark time series datasets show that L2GTX produces compact and interpretable global explanations while maintaining stable global faithfulness measured as mean local surrogate fidelity.
33.2LGMar 19
SHAPCA: Consistent and Interpretable Explanations for Machine Learning Models on Spectroscopy DataMingxing Zhang, Nicola Rossberg, Simone Innocente et al.
In recent years, machine learning models have been increasingly applied to spectroscopic datasets for chemical and biomedical analysis. For their successful adoption, particularly in clinical and safety-critical settings, professionals and researchers must be able to understand and trust the reasoning behind model predictions. However, the inherently high dimensionality and strong collinearity of spectroscopy data pose a fundamental challenge to model explainability. These properties not only complicate model training but also undermine the stability and consistency of explanations, leading to fluctuations in feature importance across repeated training runs. Feature extraction techniques have been used to reduce the input dimensionality; these new features hinder the connection between the prediction and the original signal. This study proposes SHAPCA, an explainable machine learning pipeline that combines Principal Component Analysis (for dimensionality reduction) and Shapely Additive exPlanations (for post hoc explanation) to provide explanations in the original input space, which a practitioner can interpret and link back to the biological components. The proposed framework enables analysis from both global and local perspectives, revealing the spectral bands that drive overall model behaviour as well as the instance-specific features that influence individual predictions. Numerical analysis demonstrated the interpretability of the results and greater consistency across different runs.
LGJul 9, 2025Code
e-Profits: A Business-Aligned Evaluation Metric for Profit-Sensitive Customer Churn PredictionAwais Manzoor, M. Atif Qureshi, Etain Kidney et al.
Retention campaigns in customer relationship management often rely on churn prediction models evaluated using traditional metrics such as AUC and F1-score. However, these metrics fail to reflect financial outcomes and may mislead strategic decisions. We introduce e-Profits, a novel business-aligned evaluation metric that quantifies model performance based on customer-specific value, retention probability, and intervention costs. Unlike existing profit-based metrics such as Expected Maximum Profit, which assume fixed population-level parameters, e-Profits uses Kaplan-Meier survival analysis to estimate personalised retention rates and supports granular, per customer evaluation. We benchmark six classifiers across two telecom datasets (IBM Telco and Maven Telecom) and demonstrate that e-Profits reshapes model rankings compared to traditional metrics, revealing financial advantages in models previously overlooked by AUC or F1-score. The metric also enables segment-level insight into which models maximise return on investment for high-value customers. e-Profits is designed as an understandable, post hoc tool to support model evaluation in business contexts, particularly for marketing and analytics teams prioritising profit-driven decisions. All source code is available at: https://github.com/matifq/eprofits.
AIFeb 24, 2022Code
Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured ReviewKyle Hamilton, Aparna Nayak, Bojan Božić et al.
Advocates for Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP), would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, language structure and relational structure, and whether implicit or explicit reasoning contributes to higher promise scores. We find that systems where logic is compiled into the neural network lead to the most NeSy goals being satisfied, while other factors such as knowledge representation, or type of neural architecture do not exhibit a clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human level reasoning, which impact decisions about model architectures and drive conclusions which are not always consistent across studies. Hence we advocate for a more methodical approach to the application of theories of human reasoning as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available on github for further analysis.
30.0LGMar 25
No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty AttributionsEmily Schiller, Teodor Chiaburu, Marco Zullich et al.
Research on explainable AI (XAI) has frequently focused on explaining model predictions. More recently, methods have been proposed to explain prediction uncertainty by attributing it to input features (uncertainty attributions). However, the evaluation of these methods remains inconsistent as studies rely on heterogeneous proxy tasks and metrics, hindering comparability. We address this by aligning uncertainty attributions with the well-established Co-12 framework for XAI evaluation. We propose concrete implementations for the correctness, consistency, continuity, and compactness properties. Additionally, we introduce conveyance, a property tailored to uncertainty attributions that evaluates whether controlled increases in epistemic uncertainty reliably propagate to feature-level attributions. We demonstrate our evaluation framework with eight metrics across combinations of uncertainty quantification and feature attribution methods on tabular and image data. Our experiments show that gradient-based methods consistently outperform perturbation-based approaches in consistency and conveyance, while Monte-Carlo dropconnect outperforms Monte-Carlo dropout in most metrics. Although most metrics rank the methods consistently across samples, inter-method agreement remains low. This suggests no single metric sufficiently evaluates uncertainty attribution quality. The proposed evaluation framework contributes to the body of knowledge by establishing a foundation for systematic comparison and development of uncertainty attribution methods.
SDJan 18, 2023
An investigation of the reconstruction capacity of stacked convolutional autoencoders for log-mel-spectrogramsAnastasia Natsiou, Luca Longo, Sean O'Leary
In audio processing applications, the generation of expressive sounds based on high-level representations demonstrates a high demand. These representations can be used to manipulate the timbre and influence the synthesis of creative instrumental notes. Modern algorithms, such as neural networks, have inspired the development of expressive synthesizers based on musical instrument timbre compression. Unsupervised deep learning methods can achieve audio compression by training the network to learn a mapping from waveforms or spectrograms to low-dimensional representations. This study investigates the use of stacked convolutional autoencoders for the compression of time-frequency audio representations for a variety of instruments for a single pitch. Further exploration of hyper-parameters and regularization techniques is demonstrated to enhance the performance of the initial design. In an unsupervised manner, the network is able to reconstruct a monophonic and harmonic sound based on latent representations. In addition, we introduce an evaluation metric to measure the similarity between the original and reconstructed samples. Evaluating a deep generative model for the synthesis of sound is a challenging task. Our approach is based on the accuracy of the generated frequencies as it presents a significant metric for the perception of harmonic sounds. This work is expected to accelerate future experiments on audio compression using neural autoencoders.
20.7LGApr 29
ConformaDecompose: Explaining Uncertainty via Calibration LocalizationFatima Rabia Yapicioglu, Meltem Aksoy, Alberto Rigenti et al.
Conformal Prediction provides distribution-free prediction intervals with guaranteed coverage, but its reliance on a single global calibration threshold obscures the sources of uncertainty at the instance level. In particular, it conflates irreducible noise with uncertainty induced by heterogeneous training data (aleatoric), model limitations, or calibration mismatch (epistemic), offering little insight into why an interval is wide or whether it could be reduced. We introduce an uncertainty-aware explainability framework that analyses the reducibility of calibration-induced epistemic conformal uncertainty via progressive calibration localisation for regression tasks. The approach is diagnostic rather than causal: it does not estimate true aleatoric or epistemic uncertainty, but explains how conformal intervals contract and stabilise as calibration support is localised around a test instance. Across benchmarks and real-world data, absolute reducible uncertainty aligns with epistemic proxies, while its relative contribution varies by task, revealing regimes hidden by interval width. This instance-level view complements conformal uncertainty, enhancing interpretability without altering the predictor or coverage.
2.0LGApr 29
Interpretable EEG Microstate Discovery via Variational Deep Embedding: A Systematic Architecture Search with Multi-Quadrant EvaluationSaheed Faremi, Andrea Visentin, Luca Longo
EEG microstate analysis segments continuous brain electrical activity into brief, quasi-stable topographic configurations that reflect discrete functional brain states. Conventional approaches such as Modified K-Means operate directly in electrode space with hard assignment, offering no learned latent representation, no generative decoder, and no mechanism to decode latent configurations into verifiable scalp topographies, limiting both model transparency and interpretability. To address this, we present a Convolutional Variational Deep Embedding (Conv-VaDE) model that jointly learns topographic reconstruction and probabilistic soft clustering in a shared latent space. Conv-VaDE enables generative decoding of cluster prototypes into verifiable scalp topographies, replacing opaque hard partitioning with probabilistic soft assignment. A polarity invariance scheme and a four-dimensional grid search over cluster count (K from 3 to 20), latent dimensionality, network depth, and channel width are conducted to systematically reveal how each architectural design choice shapes the quality, stability, and interpretability of learned EEG microstate representations. The model is evaluated on the LEMON resting-state eyes-closed EEG dataset with ten participants using topographic template formation, clustering stability, and global explained variance (GEV). The architecture search reveals that depth L = 4 appears consistently across all 18 best-performing configurations, yielding a best-case GEV of 0.730 and a silhouette of 0.229 at K = 4 across the model sweeps, where moderately deep networks with compact channel widths and small latent dimensionality dominate across the full K range. These results establish that principled architecture search, rather than model scale, is the key to interpretable and stable EEG microstate discovery via variational deep embedding.
AIApr 16, 2024
CNN-based explanation ensembling for dataset, representation and explanations evaluationWeronika Hryniewska-Guzik, Luca Longo, Przemysław Biecek
Explainable Artificial Intelligence has gained significant attention due to the widespread use of complex deep learning models in high-stake domains such as medicine, finance, and autonomous cars. However, different explanations often present different aspects of the model's behavior. In this research manuscript, we explore the potential of ensembling explanations generated by deep classification models using convolutional model. Through experimentation and analysis, we aim to investigate the implications of combining explanations to uncover a more coherent and reliable patterns of the model's behavior, leading to the possibility of evaluating the representation learned by the model. With our method, we can uncover problems of under-representation of images in a certain class. Moreover, we discuss other side benefits like features' reduction by replacing the original image with its explanations resulting in the removal of some sensitive information. Through the use of carefully selected evaluation metrics from the Quantus library, we demonstrated the method's superior performance in terms of Localisation and Faithfulness, compared to individual explanations.
51.6HCApr 7
Improving Explanations: Applying the Feature Understandability Scale for Cost-Sensitive Feature SelectionNicola Rossberg, Bennett Kleinberg, Barry O'Sullivan et al.
With the growing pervasiveness of artificial intelligence, the ability to explain the inferences made by machine learning models has become increasingly important. Numerous techniques for model explainability have been proposed, with natural-language textual explanations among the most widely used approaches. When applied to tabular data, these explanations typically draw on input features to justify a given inference. Consequently, a user's ability to interpret the explanation depends on their understanding of the input features. To quantify this feature-level understanding, Rossberg et al. introduced the Feature Understandability Scale. Building on that work, this proof-of-concept study collects understandability scores across two datasets, proposes a co-optimisation methodology of understandability and accuracy and presents the resulting explanations alongside the model accuracies. This work contributes to the body of knowledge on model interpretability by design. It is found that accuracy and understandability can be successfully co-optimised while maintaining high classification performances. The resulting explanations are considered more understandable at face value. Further research will aim to confirm these findings through user evaluation.
AIFeb 5, 2025
CORTEX: A Cost-Sensitive Rule and Tree Extraction MethodMarija Kopanja, Miloš Savić, Luca Longo
Tree-based and rule-based machine learning models play pivotal roles in explainable artificial intelligence (XAI) due to their unique ability to provide explanations in the form of tree or rule sets that are easily understandable and interpretable, making them essential for applications in which trust in model decisions is necessary. These transparent models are typically used in surrogate modeling, a post-hoc XAI approach for explaining the logic of black-box models, enabling users to comprehend and trust complex predictive systems while maintaining competitive performance. This study proposes the Cost-Sensitive Rule and Tree Extraction (CORTEX) method, a novel rule-based XAI algorithm grounded in the multi-class cost-sensitive decision tree (CSDT) method. The original version of the CSDT is extended to classification problems with more than two classes by inducing the concept of an n-dimensional class-dependent cost matrix. The performance of CORTEX as a rule-extractor XAI method is compared to other post-hoc tree and rule extraction methods across several datasets with different numbers of classes. Several quantitative evaluation metrics are employed to assess the explainability of generated rule sets. Our findings demonstrate that CORTEX is competitive with other tree-based methods and can be superior to other rule-based methods across different datasets. The extracted rule sets suggest the advantages of using the CORTEX method over other methods by producing smaller rule sets with shorter rules on average across datasets with a diverse number of classes. Overall, the results underscore the potential of CORTEX as a powerful XAI tool for scenarios that require the generation of clear, human-understandable rules while maintaining good predictive performance.
SPAug 31, 2025
PyNoetic: A modular python framework for no-code development of EEG brain-computer interfacesGursimran Singh, Aviral Chharia, Rahul Upadhyay et al.
Electroencephalography (EEG)-based Brain-Computer Interfaces (BCIs) have emerged as a transformative technology with applications spanning robotics, virtual reality, medicine, and rehabilitation. However, existing BCI frameworks face several limitations, including a lack of stage-wise flexibility essential for experimental research, steep learning curves for researchers without programming expertise, elevated costs due to reliance on proprietary software, and a lack of all-inclusive features leading to the use of multiple external tools affecting research outcomes. To address these challenges, we present PyNoetic, a modular BCI framework designed to cater to the diverse needs of BCI research. PyNoetic is one of the very few frameworks in Python that encompasses the entire BCI design pipeline, from stimulus presentation and data acquisition to channel selection, filtering, feature extraction, artifact removal, and finally simulation and visualization. Notably, PyNoetic introduces an intuitive and end-to-end GUI coupled with a unique pick-and-place configurable flowchart for no-code BCI design, making it accessible to researchers with minimal programming experience. For advanced users, it facilitates the seamless integration of custom functionalities and novel algorithms with minimal coding, ensuring adaptability at each design stage. PyNoetic also includes a rich array of analytical tools such as machine learning models, brain-connectivity indices, systematic testing functionalities via simulation, and evaluation methods of novel paradigms. PyNoetic's strengths lie in its versatility for both offline and real-time BCI development, which streamlines the design process, allowing researchers to focus on more intricate aspects of BCI development and thus accelerate their research endeavors. Project Website: https://neurodiag.github.io/PyNoetic
AIMay 6, 2025
Extending Decision Predicate Graphs for Comprehensive Explanation of Isolation ForestMatteo Ceschin, Leonardo Arrighi, Luca Longo et al.
The need to explain predictive models is well-established in modern machine learning. However, beyond model interpretability, understanding pre-processing methods is equally essential. Understanding how data modifications impact model performance improvements and potential biases and promoting a reliable pipeline is mandatory for developing robust machine learning solutions. Isolation Forest (iForest) is a widely used technique for outlier detection that performs well. Its effectiveness increases with the number of tree-based learners. However, this also complicates the explanation of outlier selection and the decision boundaries for inliers. This research introduces a novel Explainable AI (XAI) method, tackling the problem of global explainability. In detail, it aims to offer a global explanation for outlier detection to address its opaque nature. Our approach is based on the Decision Predicate Graph (DPG), which clarifies the logic of ensemble methods and provides both insights and a graph-based metric to explain how samples are identified as outliers using the proposed Inlier-Outlier Propagation Score (IOP-Score). Our proposal enhances iForest's explainability and provides a comprehensive view of the decision-making process, detailing which features contribute to outlier identification and how the model utilizes them. This method advances the state-of-the-art by providing insights into decision boundaries and a comprehensive view of holistic feature usage in outlier identification. -- thus promoting a fully explainable machine learning pipeline.
SPFeb 22, 2022
An Evaluation of the EEG alpha-to-theta and theta-to-alpha band Ratios as Indexes of Mental WorkloadBujar Raufi, Luca Longo
Many research works indicate that EEG bands, specifically the alpha and theta bands, have been potentially helpful cognitive load indicators. However, minimal research exists to validate this claim. This study aims to assess and analyze the impact of the alpha-to-theta and the theta-to-alpha band ratios on supporting the creation of models capable of discriminating self-reported perceptions of mental workload. A dataset of raw EEG data was utilized in which 48 subjects performed a resting activity and an induced task demanding exercise in the form of a multitasking SIMKAP test. Band ratios were devised from frontal and parietal electrode clusters. Building and model testing was done with high-level independent features from the frequency and temporal domains extracted from the computed ratios over time. Target features for model training were extracted from the subjective ratings collected after resting and task demand activities. Models were built by employing Logistic Regression, Support Vector Machines and Decision Trees and were evaluated with performance measures including accuracy, recall, precision and f1-score. The results indicate high classification accuracy of those models trained with the high-level features extracted from the alpha-to-theta ratios and theta-to-alpha ratios. Preliminary results also show that models trained with logistic regression and support vector machines can accurately classify self-reported perceptions of mental workload. This research contributes to the body of knowledge by demonstrating the richness of the information in the temporal, spectral and statistical domains extracted from the alpha-to-theta and theta-to-alpha EEG band ratios for the discrimination of self-reported perceptions of mental workload.
AIMay 29, 2020
Explainable Artificial Intelligence: a Systematic ReviewGiulia Vilone, Luca Longo
Explainable Artificial Intelligence (XAI) has experienced a significant growth over the last few years. This is due to the widespread application of machine learning, particularly deep learning, that has led to the development of highly accurate models but lack explainability and interpretability. A plethora of methods to tackle this problem have been proposed, developed and tested. This systematic review contributes to the body of knowledge by clustering these methods with a hierarchical classification system with four main clusters: review articles, theories and notions, methods and their evaluation. It also summarises the state-of-the-art in XAI and recommends future research directions.
AIDec 20, 2017
Pseudorehearsal in actor-critic agents with neural network function approximationVladimir Marochko, Leonard Johard, Manuel Mazzara et al.
Catastrophic forgetting has a significant negative impact in reinforcement learning. The purpose of this study is to investigate how pseudorehearsal can change performance of an actor-critic agent with neural-network function approximation. We tested agent in a pole balancing task and compared different pseudorehearsal approaches. We have found that pseudorehearsal can assist learning and decrease forgetting.