46.3CLJun 2
SagaQA: A Multi-hop Reasoning Benchmark for Long-form Narrative Understanding in TV SeriesGalann Pennec, Zhengyuan Liu, Nicholas Asher et al.
We introduce SagaQA, a long-form video benchmark for multi-hop reasoning over full-length TV series. Existing video reasoning benchmarks often emphasize local understanding of adjacent frames or clips. SagaQA addresses this gap by requiring high-level comprehension of extended multimodal narratives in entire TV shows. A distinguishing feature of SagaQA is the granularity of its reasoning steps. Our dataset necessitates long-range reasoning hops to connect information across completely different episodes. This requires models to reason over entire events and actions, demanding a deep understanding of the show's narration and progression at a multimodal level. Motivated by recent progress in agentic methods, we further study how different planning strategies handle such complex reasoning. We categorize these approaches into three classes-Parallel, Sequential, and Hybrid planners-and evaluate their ability to generate coherent and complete reasoning plans. Our results on SagaQA suggest that hybrid planners consistently produce higher-quality plans and exhibit stronger capabilities for complex, high-level narrative understanding in TV shows.
LGAug 7, 2024Code
In2Core: Leveraging Influence Functions for Coreset Selection in Instruction Finetuning of Large Language ModelsAyrton San Joaquin, Bin Wang, Zhengyuan Liu et al.
Despite advancements, fine-tuning Large Language Models (LLMs) remains costly due to the extensive parameter count and substantial data requirements for model generalization. Accessibility to computing resources remains a barrier for the open-source community. To address this challenge, we propose the In2Core algorithm, which selects a coreset by analyzing the correlation between training and evaluation samples with a trained model. Notably, we assess the model's internal gradients to estimate this relationship, aiming to rank the contribution of each training point. To enhance efficiency, we propose an optimization to compute influence functions with a reduced number of layers while achieving similar accuracy. By applying our algorithm to instruction fine-tuning data of LLMs, we can achieve similar performance with just 50% of the training data. Meantime, using influence functions to analyze model coverage to certain testing samples could provide a reliable and interpretable signal on the training set's coverage of those test points.
CLDec 21, 2022
Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question AnsweringAkshay Chaturvedi, Swarnadeep Bhar, Soumadeep Saha et al.
Transformer-based language models have been shown to be highly effective for several NLP tasks. In this paper, we consider three transformer models, BERT, RoBERTa, and XLNet, in both small and large versions, and investigate how faithful their representations are with respect to the semantic content of texts. We formalize a notion of semantic faithfulness, in which the semantic content of a text should causally figure in a model's inferences in question answering. We then test this notion by observing a model's behavior on answering questions about a story after performing two novel semantic interventions: deletion intervention and negation intervention. While transformer models achieve high performance on standard question answering tasks, we show that they fail to be semantically faithful once we perform these interventions for a significant number of cases (~50% for deletion intervention, and ~20% drop in accuracy for negation intervention). We then propose an intervention-based training regime that can mitigate the undesirable effects for deletion intervention by a significant margin (from ~ 50% to ~6%). We analyze the inner-workings of the models to better understand the effectiveness of intervention-based training for deletion intervention. But we show that this training does not attenuate other aspects of semantic unfaithfulness such as the models' inability to deal with negation intervention or to capture the predicate-argument structure of texts. We also test InstructGPT, via prompting, for its ability to handle the two interventions and to capture predicate-argument structure. While InstructGPT models do achieve very high performance on predicate-argument structure task, they fail to respond adequately to our deletion and negation interventions.
CLJun 21, 2023
Limits for Learning with Language ModelsNicholas Asher, Swarnadeep Bhar, Akshay Chaturvedi et al.
With the advent of large language models (LLMs), the trend in NLP has been to train LLMs on vast amounts of data to solve diverse language understanding and generation tasks. The list of LLM successes is long and varied. Nevertheless, several recent papers provide empirical evidence that LLMs fail to capture important aspects of linguistic meaning. Focusing on universal quantification, we provide a theoretical foundation for these empirical findings by proving that LLMs cannot learn certain fundamental semantic properties including semantic entailment and consistency as they are defined in formal semantics. More generally, we show that LLMs are unable to learn concepts beyond the first level of the Borel Hierarchy, which imposes severe limits on the ability of LMs, both large and small, to capture many aspects of linguistic meaning. This means that LLMs will continue to operate without formal guarantees on tasks that require entailments and deep linguistic understanding.
LGFeb 27, 2023
How optimal transport can tackle gender biases in multi-class neural-network classifiers for job recommendations?Fanny Jourdan, Titon Tshiongo Kaninku, Nicholas Asher et al.
Automatic recommendation systems based on deep neural networks have become extremely popular during the last decade. Some of these systems can however be used for applications which are ranked as High Risk by the European Commission in the A.I. act, as for instance for online job candidate recommendation. When used in the European Union, commercial AI systems for this purpose will then be required to have to proper statistical properties with regard to potential discrimination they could engender. This motivated our contribution, where we present a novel optimal transport strategy to mitigate undesirable algorithmic biases in multi-class neural-network classification. Our stratey is model agnostic and can be used on any multi-class classification neural-network model. To anticipate the certification of recommendation systems using textual data, we then used it on the Bios dataset, for which the learning task consists in predicting the occupation of female and male individuals, based on their LinkedIn biography. Results show that it can reduce undesired algorithmic biases in this context to lower levels than a standard strategy.
CLJun 8, 2023
Are fairness metric scores enough to assess discrimination biases in machine learning?Fanny Jourdan, Laurent Risser, Jean-Michel Loubes et al.
This paper presents novel experiments shedding light on the shortcomings of current metrics for assessing biases of gender discrimination made by machine learning algorithms on textual data. We focus on the Bios dataset, and our learning task is to predict the occupation of individuals, based on their biography. Such prediction tasks are common in commercial Natural Language Processing (NLP) applications such as automatic job recommendations. We address an important limitation of theoretical discussions dealing with group-wise fairness metrics: they focus on large datasets, although the norm in many industrial NLP applications is to use small to reasonably large linguistic datasets for which the main practical constraint is to get a good prediction accuracy. We then question how reliable are different popular measures of bias when the size of the training set is simply sufficient to learn reasonably accurate predictions. Our experiments sample the Bios dataset and learn more than 200 models on different sample sizes. This allows us to statistically study our results and to confirm that common gender bias indices provide diverging and sometimes unreliable results when applied to relatively small training and test samples. This highlights the crucial importance of variance calculations for providing sound results in this field.
CLDec 12, 2025
Minimal Clips, Maximum Salience: Long Video Summarization via Key Moment ExtractionGalann Pennec, Zhengyuan Liu, Nicholas Asher et al.
Vision-Language Models (VLMs) are able to process increasingly longer videos. Yet, important visual information is easily lost throughout the entire context and missed by VLMs. Also, it is important to design tools that enable cost-effective analysis of lengthy video content. In this paper, we propose a clip selection method that targets key video moments to be included in a multimodal summary. We divide the video into short clips and generate compact visual descriptions of each using a lightweight video captioning model. These are then passed to a large language model (LLM), which selects the K clips containing the most relevant visual information for a multimodal summary. We evaluate our approach on reference clips for the task, automatically derived from full human-annotated screenplays and summaries in the MovieSum dataset. We further show that these reference clips (less than 6% of the movie) are sufficient to build a complete multimodal summary of the movies in MovieSum. Using our clip selection method, we achieve a summarization performance close to that of these reference clips while capturing substantially more relevant video information than random clip selection. Importantly, we maintain low computational cost by relying on a lightweight captioning model.
64.4LGMay 14
TAPIOCA: Why Task- Aware Pruning Improves OOD model CapabilityKrish Sharma, Omar Naim, Soumadeep Saha et al.
Recent work has promoted task-aware layer pruning as a way to improve model performance on particular tasks, as shown by TALE. In this paper, we investigate when such improvements occur and why. We show first that, across controlled polynomial regression tasks and large language models, such pruning yields no benefit on in-distribution (ID) data but consistently improves out-of-distribution (OOD) accuracy. We further show empirically that OOD inputs induce layerwise norm and pairwise-distance profiles that deviate from the corresponding ID profiles. This leads to a geometric explanation of task-aware pruning: each task induces a task-adapted geometry, characterized empirically by the representation profiles observed on ID inputs. OOD inputs can introduce a distorted version of the task-adapted geometry. Task-aware pruning identifies layers that create or amplify this distortion; by removing them, it shifts OOD representational norms and pairwise distances toward those observed on the adapted distribution. This realigns OOD inputs with the model's task-adapted geometry and improves performance. We provide causal evidence through controlled distribution shifts and residual-scaling interventions, and demonstrate consistent behavior across model scales.
CLDec 11, 2023Code
TaCo: Targeted Concept Erasure Prevents Non-Linear Classifiers From Detecting Protected AttributesFanny Jourdan, Louis Béthune, Agustin Picard et al.
Ensuring fairness in NLP models is crucial, as they often encode sensitive attributes like gender and ethnicity, leading to biased outcomes. Current concept erasure methods attempt to mitigate this by modifying final latent representations to remove sensitive information without retraining the entire model. However, these methods typically rely on linear classifiers, which leave models vulnerable to non-linear adversaries capable of recovering sensitive information. We introduce Targeted Concept Erasure (TaCo), a novel approach that removes sensitive information from final latent representations, ensuring fairness even against non-linear classifiers. Our experiments show that TaCo outperforms state-of-the-art methods, achieving greater reductions in the prediction accuracy of sensitive attributes by non-linear classifier while preserving overall task performance. Code is available on https://github.com/fanny-jourdan/TaCo.
AIMar 2
Exploring Plan Space through Conversation: An Agentic Framework for LLM-Mediated Explanations in PlanningGuilhem Fouilhé, Rebecca Eifler, Antonin Poché et al.
When automating plan generation for a real-world sequential decision problem, the goal is often not to replace the human planner, but to facilitate an iterative reasoning and elicitation process, where the human's role is to guide the AI planner according to their preferences and expertise. In this context, explanations that respond to users' questions are crucial to improve their understanding of potential solutions and increase their trust in the system. To enable natural interaction with such a system, we present a multi-agent Large Language Model (LLM) architecture that is agnostic to the explanation framework and enables user- and context-dependent interactive explanations. We also describe an instantiation of this framework for goal-conflict explanations, which we use to conduct a user study comparing the LLM-powered interaction with a baseline template-based explanation interface.
AIMay 20, 2024
From SHAP Scores to Feature Importance ScoresOlivier Letoffe, Xuanxiang Huang, Nicholas Asher et al.
A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the ubiquitous recent use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions, using the game-theoretical foundation underlying SHAP and LIME, can yield manifestly unsatisfactory results, that tantamount to reporting misleading relative feature importance. Recent work targeted rigorous feature attribution, by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent a few instantiations of the range of power indices studied in the past. Furthermore, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs), i.e. the use of power indices in XAI, and which of these indices would be the best suited for the purposes of XAI by feature attribution, namely in terms of not producing results that could be deemed as unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit. In addition, the paper also proposes novel FISs exhibiting the proposed properties. Finally, the paper conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.
CLFeb 16, 2024
Strong hallucinations from negation and how to fix themNicholas Asher, Swarnadeep Bhar
Despite great performance on many tasks, language models (LMs) still struggle with reasoning, sometimes providing responses that cannot possibly be true because they stem from logical incoherence. We call such responses \textit{strong hallucinations} and prove that they follow from an LM's computation of its internal representations for logical operators and outputs from those representations. Focusing on negation, we provide a novel solution in which negation is treated not as another element of a latent representation, but as \textit{an operation over an LM's latent representations that constrains how they may evolve}. We show that our approach improves model performance in cloze prompting and natural language inference tasks with negation without requiring training on sparse negative data.
LGNov 18, 2024
Re-examining learning linear functions in contextOmar Naim, Guilhem Fouilhé, Nicholas Asher
In-context learning (ICL) has emerged as a powerful paradigm for easily adapting Large Language Models (LLMs) to various tasks. However, our understanding of how ICL works remains limited. We explore a simple model of ICL in a controlled setup with synthetic training data to investigate ICL of univariate linear functions. We experiment with a range of GPT-2-like transformer models trained from scratch. Our findings challenge the prevailing narrative that transformers adopt algorithmic approaches like linear regression to learn a linear function in-context. These models fail to generalize beyond their training distribution, highlighting fundamental limitations in their capacity to infer abstract task structures. Our experiments lead us to propose a mathematically precise hypothesis of what the model might be learning.
CVMar 18, 2024
Modality-Agnostic fMRI Decoding of Vision and LanguageMitja Nikolaus, Milad Mozafari, Nicholas Asher et al.
Previous studies have shown that it is possible to map brain activation data of subjects viewing images onto the feature representation space of not only vision models (modality-specific decoding) but also language models (cross-modal decoding). In this work, we introduce and use a new large-scale fMRI dataset (~8,500 trials per subject) of people watching both images and text descriptions of such images. This novel dataset enables the development of modality-agnostic decoders: a single decoder that can predict which stimulus a subject is seeing, irrespective of the modality (image or text) in which the stimulus is presented. We train and evaluate such decoders to map brain signals onto stimulus representations from a large range of publicly available vision, language and multimodal (vision+language) models. Our findings reveal that (1) modality-agnostic decoders perform as well as (and sometimes even better than) modality-specific decoders (2) modality-agnostic decoders mapping brain data onto representations from unimodal models perform as well as decoders relying on multimodal representations (3) while language and low-level visual (occipital) brain regions are best at decoding text and image stimuli, respectively, high-level visual (temporal) regions perform well on both stimulus types.
CLOct 24, 2024
On Explaining with Attention MatricesOmar Naim, Nicholas Asher
This paper explores the much discussed, possible explanatory link between attention weights (AW) in transformer models and predicted output. Contrary to intuition and early research on attention, more recent prior research has provided formal arguments and empirical evidence that AW are not explanatorily relevant. We show that the formal arguments are incorrect. We introduce and effectively compute efficient attention, which isolates the effective components of attention matrices in tasks and models in which AW play an explanatory role. We show that efficient attention has a causal role (provides minimally necessary and sufficient conditions) for predicting model output in NLP tasks requiring contextual information, and we show, contrary to [7], that efficient attention matrices are probability distributions and are effectively calculable. Thus, they should play an important part in the explanation of attention based model behavior. We offer empirical experiments in support of our method illustrating various properties of efficient attention with various metrics on four datasets.
CLMar 6, 2025
DIMSUM: Discourse in Mathematical Reasoning as a Supervision ModuleKrish Sharma, Niyar R Barman, Akshay Chaturvedi et al.
We look at reasoning on GSM8k, a dataset of short texts presenting primary school, math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the data set may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160%. Even for models that have most likely memorized the data set, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out of distribution examples.
MLFeb 5, 2025
Analyzing limits for in-context learningOmar Naim, Jerome Bolte, Nicholas Asher
Our paper challenges claims from prior research that transformer-based models, when learning in context, implicitly implement standard learning algorithms. We present empirical evidence inconsistent with this view and provide a mathematical analysis demonstrating that transformers cannot achieve general predictive accuracy due to inherent architectural limitations.
LGOct 26, 2025
TELL-TALE: Task Efficient LLMs with Task Aware Layer EliminationOmar Naim, Krish Sharma, Nicholas Asher
In this paper we introduce Tale, Task-Aware Layer Elimination, an inference-time algorithm that prunes entire transformer layers in an LLM by directly optimizing task-specific validation performance. We evaluate TALE on 9 tasks and 5 models, including LLaMA 3.1 8B, Qwen 2.5 7B, Qwen 2.5 0.5B, Mistral 7B, and Lucie 7B, under both zero-shot and few-shot settings. Unlike prior approaches, TALE requires no retraining and consistently improves accuracy while reducing computational cost across all benchmarks. Furthermore, applying TALE during finetuning leads to additional performance gains. Finally, TALE provides flexible user control over trade-offs between accuracy and efficiency. Mutual information analysis shows that certain layers act as bottlenecks, degrading task-relevant representations. Tale's selective layer removal remedies this problem, producing smaller, faster, and more accurate models that are also faster to fine-tune while offering new insights into transformer interpretability.
CLAug 29, 2025
COCORELI: Cooperative, Compositional Reconstitution \& Execution of Language InstructionsSwarnadeep Bhar, Omar Naim, Eleni Metheniti et al.
We present COCORELI, a hybrid agent framework designed to tackle the limitations of large language models (LLMs) in tasks requiring: following complex instructions, minimizing hallucination, and spatial reasoning. COCORELI integrates medium-sized LLM agents with novel abstraction mechanisms and a discourse module to parse instructions to in-context learn dynamic, high-level representations of the environment. Experiments on natural collaborative construction tasks show that COCORELI outperforms single-LLM CoT and agentic LLM systems, all using larger LLMs. It manages to largely avoid hallucinations, identify missing information, ask for clarifications, and update its learned objects. COCORELI's abstraction abilities extend beyond ENVIRONMENT, as shown in the ToolBench API completion task.
CLAug 20, 2025
Scaled Signed Averaging Improves In-Context and Early Learning Benchmark Performance in Small TransformersOmar Naim, Swarnadeep Bhar, Jérôme Bolte et al.
While Large Language models' abilities for in-context learning (ICL) have drawn much attention, we examine some of its limitations on semantic tasks involving quantifiers like "all" and "some", as well as on tasks with linear functions. We identify Softmax, the scoring function in attention mechanism, as a contributing factor to these limitations. We propose scaled signed averaging (SSA), a novel alternative to Softmax to mitigate these problems. We show that SSA significantly improves performance on our ICL tasks. In addition, SSA outperforms transformer models with Softmax on several early learning NLP benchmarks and linguistic probing tasks on zero and few-shot settings.
CLJul 15, 2025
KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning?Soumadeep Saha, Akshay Chaturvedi, Saptarshi Saha et al.
Chain-of-thought traces have been shown to improve performance of large language models in a plethora of reasoning tasks, yet there is no consensus on the mechanism through which this performance boost is achieved. To shed more light on this, we introduce Causal CoT Graphs (CCGs), which are directed acyclic graphs automatically extracted from reasoning traces that model fine-grained causal dependencies in the language model output. A collection of $1671$ mathematical reasoning problems from MATH500, GSM8K and AIME, and their associated CCGs are compiled into our dataset -- \textbf{KisMATH}. Our detailed empirical analysis with 15 open-weight LLMs shows that (i) reasoning nodes in the CCG are mediators for the final answer, a condition necessary for reasoning; and (ii) LLMs emphasise reasoning paths given by the CCG, indicating that models internally realise structures akin to our graphs. KisMATH enables controlled, graph-aligned interventions and opens up avenues for further investigation into the role of chain-of-thought in LLM reasoning.
CLMay 10, 2025
Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and EvaluationGalann Pennec, Zhengyuan Liu, Nicholas Asher et al.
Vision-Language Models (VLMs) often struggle to balance visual and textual information when summarizing complex multimodal inputs, such as entire TV show episodes. In this paper, we propose a zero-shot video-to-text summarization approach that builds its own screenplay representation of an episode, effectively integrating key video moments, dialogue, and character information into a unified document. Unlike previous approaches, we simultaneously generate screenplays and name the characters in zero-shot, using only the audio, video, and transcripts as input. Additionally, we highlight that existing summarization metrics can fail to assess the multimodal content in summaries. To address this, we introduce MFactSum, a multimodal metric that evaluates summaries with respect to both vision and text modalities. Using MFactSum, we evaluate our screenplay summaries on the SummScreen3D dataset, demonstrating superiority against state-of-the-art VLMs such as Gemini 1.5 by generating summaries containing 20% more relevant visual information while requiring 75% less of the video as input.
CLJun 26, 2024
Llamipa: An Incremental Discourse ParserKate Thompson, Akshay Chaturvedi, Julie Hunter et al.
This paper provides the first discourse parsing experiments with a large language model(LLM) finetuned on corpora annotated in the style of SDRT (Segmented Discourse Representation Theory Asher, 1993; Asher and Lascarides, 2003). The result is a discourse parser, Llamipa (Llama Incremental Parser), that leverages discourse context, leading to substantial performance gains over approaches that use encoder-only models to provide local, context-sensitive representations of discourse units. Furthermore, it can process discourse data incrementally, which is essential for the eventual use of discourse information in downstream tasks.
CLJun 26, 2024
Nebula: A discourse aware Minecraft BuilderAkshay Chaturvedi, Kate Thompson, Nicholas Asher
When engaging in collaborative tasks, humans efficiently exploit the semantic structure of a conversation to optimize verbal and nonverbal interactions. But in recent "language to code" or "language to action" models, this information is lacking. We show how incorporating the prior discourse and nonlinguistic context of a conversation situated in a nonlinguistic environment can improve the "language to action" component of such interactions. We finetune an LLM to predict actions based on prior context; our model, Nebula, doubles the net-action F1 score over the baseline on this task of Jayannavar et al.(2020). We also investigate our model's ability to construct shapes and understand location descriptions using a synthetic dataset
CLMay 11, 2023
COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasksFanny Jourdan, Agustin Picard, Thomas Fel et al.
Transformer architectures are complex and their use in NLP, while it has engendered many successes, makes their interpretability or explainability challenging. Recent debates have shown that attention maps and attribution methods are unreliable (Pruthi et al., 2019; Brunner et al., 2019). In this paper, we present some of their limitations and introduce COCKATIEL, which successfully addresses some of them. COCKATIEL is a novel, post-hoc, concept-based, model-agnostic XAI technique that generates meaningful explanations from the last layer of a neural net model trained on an NLP classification task by using Non-Negative Matrix Factorization (NMF) to discover the concepts the model leverages to make predictions and by exploiting a Sensitivity Analysis to estimate accurately the importance of each of these concepts for the model. It does so without compromising the accuracy of the underlying model or requiring a new one to be trained. We conduct experiments in single and multi-aspect sentiment analysis tasks and we show COCKATIEL's superior ability to discover concepts that align with humans' on Transformer models without any supervision, we objectively verify the faithfulness of its explanations through fidelity metrics, and we showcase its ability to provide meaningful explanations in two different datasets.
AIOct 19, 2021
Interpretive BlindnessNicholas Asher, Julie Hunter
We model here an epistemic bias we call \textit{interpretive blindness} (IB). IB is a special problem for learning from testimony, in which one acquires information only from text or conversation. We show that IB follows from a co-dependence between background beliefs and interpretation in a Bayesian setting and the nature of contemporary testimony. We argue that a particular characteristic contemporary testimony, \textit{argumentative completeness}, can preclude learning in hierarchical Bayesian settings, even in the presence of constraints that are designed to promote good epistemic practices.
AIAug 30, 2021
Transport-based Counterfactual ModelsLucas de Lara, Alberto González-Sanz, Nicholas Asher et al.
Counterfactual frameworks have grown popular in machine learning for both explaining algorithmic decisions but also defining individual notions of fairness, more intuitive than typical group fairness conditions. However, state-of-the-art models to compute counterfactuals are either unrealistic or unfeasible. In particular, while Pearl's causal inference provides appealing rules to calculate counterfactuals, it relies on a model that is unknown and hard to discover in practice. We address the problem of designing realistic and feasible counterfactuals in the absence of a causal model. We define transport-based counterfactual models as collections of joint probability distributions between observable distributions, and show their connection to causal counterfactuals. More specifically, we argue that optimal-transport theory defines relevant transport-based counterfactual models, as they are numerically feasible, statistically-faithful, and can coincide under some assumptions with causal counterfactual models. Finally, these models make counterfactual approaches to fairness feasible, and we illustrate their practicality and efficiency on fair learning. With this paper, we aim at laying out the theoretical foundations for a new, implementable approach to counterfactual thinking.
AIJul 4, 2021
Efficient Explanations for Knowledge Compilation LanguagesXuanxiang Huang, Yacine Izza, Alexey Ignatiev et al.
Knowledge compilation (KC) languages find a growing number of practical uses, including in Constraint Programming (CP) and in Machine Learning (ML). In most applications, one natural question is how to explain the decisions made by models represented by a KC language. This paper shows that for many of the best known KC languages, well-known classes of explanations can be computed in polynomial time. These classes include deterministic decomposable negation normal form (d-DNNF), and so any KC language that is strictly less succinct than d-DNNF. Furthermore, the paper also investigates the conditions under which polynomial time computation of explanations can be extended to KC languages more succinct than d-DNNF.
LGDec 21, 2020
On Relating 'Why?' and 'Why Not?' ExplanationsAlexey Ignatiev, Nina Narodytska, Nicholas Asher et al.
Explanations of Machine Learning (ML) models often address a 'Why?' question. Such explanations can be related with selecting feature-value pairs which are sufficient for the prediction. Recent work has investigated explanations that address a 'Why Not?' question, i.e. finding a change of feature values that guarantee a change of prediction. Given their goals, these two forms of explaining predictions of ML models appear to be mostly unrelated. However, this paper demonstrates otherwise, and establishes a rigorous formal relationship between 'Why?' and 'Why Not?' explanations. Concretely, the paper proves that, for any given instance, 'Why?' explanations are minimal hitting sets of 'Why Not?' explanations and vice-versa. Furthermore, the paper devises novel algorithms for extracting and enumerating both forms of explanations.
AIJan 21, 2020
Adequate and fair explanationsNicholas Asher, Soumya Paul, Chris Russell
Explaining sophisticated machine-learning based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human interpreatable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human readable form. Interpretability requires epistemically accessible explanations, explanations humans can grasp. Yet what is a sufficiently complete epistemically accessible explanation still needs clarification. We do this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are also local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair.We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation.
CLJun 29, 2018
Bias in Semantic and Discourse InterpretationNicholas Asher, Soumya Paul
In this paper, we show how game-theoretic work on conversation combined with a theory of discourse structure provides a framework for studying interpretive bias. Interpretive bias is an essential feature of learning and understanding but also something that can be used to pervert or subvert the truth. The framework we develop here provides tools for understanding and analyzing the range of interpretive biases and the factors that contribute to them.