AI Jul 20, 2023
Modifications of the Miller definition of contrastive (counterfactual) explanations
Kevin McAreavey, Weiru Liu
Miller recently proposed a definition of contrastive (counterfactual) explanations based on the well-known Halpern-Pearl (HP) definitions of causes and (non-contrastive) explanations. Crucially, the Miller definition was based on the original HP definition of explanations, but this has since been modified by Halpern, presumably because the original yields counterintuitive results in many standard examples. More recently, Borner has proposed a third definition, observing that the modified HP definition may also yield counterintuitive results. In this paper we show that the Miller definition inherits issues found in the original HP definition. We address these issues by proposing two improved variants based on the more robust modified HP and Borner definitions. We analyse our new definitions and show that they retain the spirit of the Miller definition, in that all three variants satisfy an alternative unified definition that is modular with respect to an underlying definition of non-contrastive explanations. To the best of our knowledge, this paper also provides the first explicit comparison between the original and modified HP definitions.
AI Sep 24, 2024
TSFeatLIME: An Online User Study in Enhancing Explainability in Univariate Time Series Forecasting
Hongnan Ma, Kevin McAreavey, Weiru Liu
Time series forecasting, while vital in various applications, often employs complex models that are difficult for humans to understand. Effective explainable AI techniques are crucial to bridging the gap between model predictions and user understanding. This paper presents TSFeatLIME, a framework that extends TSLIME and is tailored specifically to explaining univariate time series forecasting. TSFeatLIME integrates an auxiliary feature into the surrogate model and considers the pairwise Euclidean distances between the queried time series and the generated samples to improve the fidelity of the surrogate models. However, the usefulness of such explanations for human beings remains an open question. We address this by conducting a user study with 160 participants through two interactive interfaces, aiming to measure how well individuals from different backgrounds can simulate or predict model output changes in the treatment group and control group. Our results show that the surrogate model under the TSFeatLIME framework is able to better simulate the behaviour of the black-box model when distance is considered, without sacrificing accuracy. In addition, the user study suggests that the explanations were significantly more effective for participants without a computer science background.
AI Sep 20, 2024
A User Study on Contrastive Explanations for Multi-Effector Temporal Planning with Non-Stationary Costs
Xiaowei Liu, Kevin McAreavey, Weiru Liu
In this paper, we adopt contrastive explanations within an end-user application for temporal planning of smart homes. In this application, users have requirements on the execution of appliance tasks, pay for energy according to dynamic energy tariffs, have access to high-capacity battery storage, and are able to sell energy to the grid. The concurrent scheduling of devices makes this a multi-effector planning problem, while the dynamic tariffs yield costs that are non-stationary (alternatively, costs that are stationary but depend on exogenous events). Because of these characteristics, the planning problems are generally not supported by existing PDDL-based planners, so we instead design a custom domain-dependent planner that scales to reasonable appliance numbers and time horizons. We conduct a controlled user study with 128 participants using an online crowd-sourcing platform based on two user stories. Our results indicate that users provided with contrastive questions and explanations have higher levels of satisfaction, tend to gain improved understanding, and rate the helpfulness of the recommended AI schedule more favourably than those without access to these features.
AI Aug 5, 2024
Explaining Reinforcement Learning: A Counterfactual Shapley Values Approach
Yiwei Shi, Qi Zhang, Kevin McAreavey et al.
This paper introduces a novel approach, Counterfactual Shapley Values (CSV), which enhances explainability in reinforcement learning (RL) by integrating counterfactual analysis with Shapley Values. The approach aims to quantify and compare the contributions of different state dimensions to various action choices. To analyze these impacts more accurately, we introduce new characteristic value functions, the "Counterfactual Difference Characteristic Value" and the "Average Counterfactual Difference Characteristic Value". These functions help calculate the Shapley values to evaluate the differences in contributions between optimal and non-optimal actions. Experiments across several RL domains, such as GridWorld, FrozenLake, and Taxi, demonstrate the effectiveness of the CSV method. The results show that this method not only improves transparency in complex RL systems but also quantifies the differences across various decisions.
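As a rough illustration of the idea in this abstract, the sketch below computes exact Shapley values for state dimensions, using a characteristic function that measures the value gap between a chosen action and an alternative. Everything here is an assumption made for illustration: the toy `q_value` function, the zero `baseline` used to mask absent dimensions, and the `counterfactual_difference` function are hypothetical stand-ins, not the paper's definitions of its characteristic value functions.

```python
from itertools import combinations
from math import factorial


def q_value(state, action):
    """Hypothetical toy Q-function over two state dimensions (assumption)."""
    x, y = state["x"], state["y"]
    return 2 * x + y if action == "left" else x + 3 * y


def counterfactual_difference(coalition, state, baseline, chosen, alternative):
    """Illustrative characteristic function: the Q-value gap between the chosen
    and the alternative action when only the dimensions in `coalition` keep
    their observed values and the rest are replaced by `baseline`."""
    masked = {k: (state[k] if k in coalition else baseline[k]) for k in state}
    return q_value(masked, chosen) - q_value(masked, alternative)


def shapley_values(features, value):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                phi[f] += weight * (value(s | {f}) - value(s))
    return phi


state = {"x": 1.0, "y": 2.0}
baseline = {"x": 0.0, "y": 0.0}  # assumed reference values for masked dimensions
phi = shapley_values(
    list(state),
    lambda s: counterfactual_difference(s, state, baseline, "left", "right"),
)
```

In this toy setting the contributions sum to the full value gap between the two actions (the efficiency property of Shapley values). Exact enumeration is exponential in the number of state dimensions, so larger state spaces would need sampling-based approximations.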
RO Apr 7, 2021 Code
On Determinism of Game Engines used for Simulation-based Autonomous Vehicle Verification
Greg Chance, Abanoub Ghobrial, Kevin McAreavey et al.
Game engines are increasingly used as simulation platforms by the autonomous vehicle (AV) community to develop vehicle control systems and test environments. A key requirement for simulation-based development and verification is determinism, since a deterministic process will always produce the same output given the same initial conditions and event history. Thus, in a deterministic simulation environment, tests are rendered repeatable and yield simulation results that are trustworthy and straightforward to debug. However, game engines are seldom deterministic. This paper reviews and identifies the potential causes of non-deterministic behaviours in game engines. A case study using CARLA, an open-source autonomous driving simulation environment powered by Unreal Engine, is presented to highlight its inherent shortcomings in providing sufficient precision in experimental results. Different configurations and utilisations of the software and hardware are explored to determine an operational domain where the simulation precision is sufficiently low, i.e. variance between repeated executions becomes negligible for development and testing work. Finally, a general method is proposed that can be used to find the domains of permissible variance in game engine simulations for any given system configuration.
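The notion of variance between repeated executions can be made concrete with a small sketch. The code below repeats a simulation several times under the same initial conditions, computes the per-timestep variance across runs, and compares the worst case against a tolerance. The two toy simulators, the function names, and the tolerance value are all hypothetical, chosen only to illustrate the check; this is not the paper's method and does not use the CARLA API.

```python
import random
from statistics import pvariance


def max_variance(run, repeats):
    """Run `run()` `repeats` times and return the largest per-timestep
    population variance across the resulting trajectories."""
    trajectories = [run() for _ in range(repeats)]
    # zip(*trajectories) groups the values of all runs at the same timestep
    return max(pvariance(step) for step in zip(*trajectories))


def within_tolerance(run, repeats, tolerance):
    """True if the variance between repeated runs never exceeds `tolerance`."""
    return max_variance(run, repeats) <= tolerance


def deterministic_run():
    """Toy simulator (assumption): vehicle position at ten fixed timesteps."""
    return [0.5 * t for t in range(10)]


_rng = random.Random(0)  # shared generator, so successive runs differ


def jittery_run():
    """Toy simulator (assumption): same trajectory plus small timing jitter."""
    return [0.5 * t + _rng.gauss(0.0, 0.01) for t in range(10)]


deterministic_spread = max_variance(deterministic_run, repeats=5)
jittery_spread = max_variance(jittery_run, repeats=5)
```

Repeating such a check across different hardware and software configurations would give a rough map of where variance stays within an acceptable bound.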