CL Apr 11, 2022
ProtoTEx: Explaining Model Decisions with Prototype Tensors
Anubrata Das, Chitrank Gupta, Venelin Kovatchev et al.
We present ProtoTEx, a novel white-box NLP classification architecture based on prototype networks. ProtoTEx faithfully explains model decisions based on prototype tensors that encode latent clusters of training examples. At inference time, classification decisions are based on the distances between the input text and the prototype tensors, explained via the training examples most similar to the most influential prototypes. We also describe a novel interleaved training algorithm that effectively handles classes characterized by the absence of indicative features. On a propaganda detection task, ProtoTEx accuracy matches BART-large and exceeds BERT-large with the added benefit of providing faithful explanations. A user study also shows that prototype-based explanations help non-experts to better recognize propaganda in online news.
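The distance-based decision described in this abstract can be illustrated with a toy nearest-prototype classifier. This is a minimal sketch under strong assumptions, not the ProtoTEx architecture: the 2-D vectors, the one-label-per-prototype table, and the function name `classify_with_explanation` are all hypothetical, and the learned encoder and classification layer of the real model are omitted.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_with_explanation(x, prototypes, proto_labels,
                              train_vecs, train_texts, k=1):
    """Predict the label of the nearest prototype (a simplification) and
    return the k training texts closest to that prototype as the explanation."""
    dists = [euclidean(x, p) for p in prototypes]
    best = min(range(len(prototypes)), key=dists.__getitem__)
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: euclidean(train_vecs[i], prototypes[best]))
    return proto_labels[best], best, [train_texts[i] for i in ranked[:k]]

# Hypothetical toy data: two prototypes, three embedded training examples.
prototypes = [[0.0, 0.0], [4.0, 4.0]]
proto_labels = ["non-propaganda", "propaganda"]
train_vecs = [[0.2, -0.1], [3.8, 4.1], [1.0, 1.0]]
train_texts = ["neutral report", "loaded slogan", "mixed editorial"]

label, proto_idx, evidence = classify_with_explanation(
    [3.5, 3.9], prototypes, proto_labels, train_vecs, train_texts)
print(label, proto_idx, evidence)
```

The explanation returned here is the training example nearest to the winning prototype, which mirrors the "most similar training examples" idea only in spirit.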
CL Jun 29, 2022
longhorns at DADC 2022: How many linguists does it take to fool a Question Answering model? A systematic approach to adversarial attacks
Venelin Kovatchev, Trina Chatterjee, Venkata S Govindarajan et al.
Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability. Here, we describe the approach of the team "longhorns" on Task 1 of The First Workshop on Dynamic Adversarial Data Collection (DADC), which asked teams to manually fool a model on an Extractive Question Answering task. Our team finished first, with a model error rate of 62%. We advocate for a systematic, linguistically informed approach to formulating adversarial questions, and we describe the results of our pilot experiments, as well as our official submission.
CL Jan 8, 2023
The State of Human-centered NLP Technology for Fact-checking
Anubrata Das, Houjiang Liu, Venelin Kovatchev et al.
Misinformation threatens modern society by promoting distrust in science, changing narratives in public health, heightening social polarization, and disrupting democratic elections and financial markets, among a myriad of other societal harms. To address this, a growing cadre of professional fact-checkers and journalists provide high-quality investigations into purported facts. However, these largely manual efforts have struggled to match the enormous scale of the problem. In response, a growing body of Natural Language Processing (NLP) technologies have been proposed for more scalable fact-checking. Despite tremendous growth in such research, however, practical adoption of NLP technologies for fact-checking still remains in its infancy today. In this work, we review the capabilities and limitations of the current NLP technologies for fact-checking. Our particular focus is to further chart the design space for how these technologies can be harnessed and refined in order to better meet the needs of human fact-checkers. To do so, we review key aspects of NLP-based fact-checking: task formulation, dataset construction, modeling, and human-centered strategies, such as explainable models and human-in-the-loop approaches. Next, we review the efficacy of applying NLP-based fact-checking tools to assist human fact-checkers. We recommend that future research include collaboration with fact-checker stakeholders early on in NLP research, as well as incorporation of human-centered design practices in model development, in order to further guide technology development for human use and practical adoption. Finally, we advocate for more research on benchmark development supporting extrinsic evaluation of human-centered fact-checking technologies.
CL Apr 15, 2022
Finding Pareto Trade-offs in Fair and Accurate Detection of Toxic Speech
Soumyajit Gupta, Venelin Kovatchev, Anubrata Das et al.
Optimizing NLP models for fairness poses many challenges. Lack of differentiable fairness measures prevents gradient-based loss training or requires surrogate losses that diverge from the true metric of interest. In addition, competing objectives (e.g., accuracy vs. fairness) often require making trade-offs based on stakeholder preferences, but stakeholders may not know their preferences before seeing system performance under different trade-off settings. To address these challenges, we begin by formulating a differentiable version of a popular fairness measure, Accuracy Parity, to provide balanced accuracy across demographic groups. Next, we show how model-agnostic, HyperNetwork optimization can efficiently train arbitrary NLP model architectures to learn Pareto-optimal trade-offs between competing metrics. Focusing on the task of toxic language detection, we show the generality and efficacy of our methods across two datasets, three neural architectures, and three fairness losses.
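One way to picture a differentiable surrogate for Accuracy Parity is to replace hard 0/1 correctness with the probability the model assigns to the true label, then compare the per-group averages. The sketch below is an assumption made for illustration; the abstract does not give the paper's exact formulation, and the names `soft_accuracy` and `accuracy_parity_gap` are hypothetical.

```python
def soft_accuracy(probs, labels):
    """Mean probability assigned to the true binary label.
    probs[i] is the predicted probability of the positive class."""
    return sum(p if y == 1 else 1.0 - p for p, y in zip(probs, labels)) / len(labels)

def accuracy_parity_gap(probs, labels, groups):
    """Gap between the best and worst per-group soft accuracy.
    Smooth in probs except where the max or min group changes."""
    by_group = {}
    for p, y, g in zip(probs, labels, groups):
        ps, ys = by_group.setdefault(g, ([], []))
        ps.append(p)
        ys.append(y)
    accs = {g: soft_accuracy(ps, ys) for g, (ps, ys) in by_group.items()}
    return max(accs.values()) - min(accs.values()), accs

# Hypothetical predictions for two demographic groups "a" and "b".
gap, accs = accuracy_parity_gap(
    probs=[0.9, 0.2, 0.6, 0.5],
    labels=[1, 0, 1, 0],
    groups=["a", "a", "b", "b"],
)
print(round(gap, 2), {g: round(v, 2) for g, v in accs.items()})
```

In a training loop, a gap of this kind would be one of the competing objectives traded off against the task loss.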
HC Aug 14, 2023
Human-centered NLP Fact-checking: Co-Designing with Fact-checkers using Matchmaking for AI
Houjiang Liu, Anubrata Das, Alexander Boltz et al.
While many Natural Language Processing (NLP) techniques have been proposed for fact-checking, both academic research and fact-checking organizations report limited adoption of such NLP work due to poor alignment with fact-checker practices, values, and needs. To address this, we investigate a co-design method, Matchmaking for AI, to enable fact-checkers, designers, and NLP researchers to collaboratively identify what fact-checker needs should be addressed by technology, and to brainstorm ideas for potential solutions. Co-design sessions we conducted with 22 professional fact-checkers yielded a set of 11 design ideas that offer a "north star", integrating fact-checker criteria into novel NLP design concepts. These concepts include pre-bunking misinformation, efficient and personalized misinformation monitoring, proactively reducing fact-checkers' potential biases, and collaboratively writing fact-check reports. Our work provides new insights into both human-centered fact-checking research and practice and AI co-design research.
LG May 31, 2025
Linear Representation Transferability Hypothesis: Leveraging Small Models to Steer Large Models
Femi Bello, Anubrata Das, Fanzhi Zeng et al.
It has been hypothesized that neural networks with similar architectures trained on similar data learn shared representations relevant to the learning task. We build on this idea by extending the conceptual framework where representations learned across models trained on the same data can be expressed as linear combinations of a universal set of basis features. These basis features underlie the learning task itself and remain consistent across models, regardless of scale. From this framework, we propose the Linear Representation Transferability (LRT) Hypothesis -- that there exists an affine transformation between the representation spaces of different models. To test this hypothesis, we learn affine mappings between the hidden states of models of different sizes and evaluate whether steering vectors -- directions in hidden state space associated with specific model behaviors -- retain their semantic effect when transferred from small to large language models using the learned mappings. We find strong empirical evidence that such affine mappings can preserve steering behaviors. These findings suggest that representations learned by small models can be used to guide the behavior of large models, and that the LRT hypothesis may be a promising direction for understanding representation alignment across model scales.
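The affine-mapping idea can be written out as follows. This is a sketch under assumptions: the abstract does not state how the mapping is fit, so the least-squares objective is only one plausible choice, and the symbols $h_s^{(i)}$, $h_\ell^{(i)}$ (paired small-model and large-model hidden states), $A$, $b$, and $v_s$ are notation introduced here.

```latex
% Assumed least-squares fit of the affine map (A, b) on N paired hidden states
\[
  (A^\ast, b^\ast) = \arg\min_{A,\,b} \sum_{i=1}^{N}
  \bigl\| A\,h_s^{(i)} + b - h_\ell^{(i)} \bigr\|_2^2
\]
% A steering vector is a direction (a difference of two points), so the bias cancels
\[
  \bigl(A(h_s + v_s) + b\bigr) - \bigl(A\,h_s + b\bigr) = A\,v_s
  \quad\Longrightarrow\quad v_\ell = A^\ast v_s
\]
```

Under this reading, only the linear part $A^\ast$ acts on a transferred steering direction, while $b^\ast$ matters for mapping the hidden states themselves.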
HC Feb 17, 2022
The Effects of Interactive AI Design on User Behavior: An Eye-tracking Study of Fact-checking COVID-19 Claims
Li Shi, Nilavra Bhattacharya, Anubrata Das et al.
We conducted a lab-based eye-tracking study to investigate how the interactivity of an AI-powered fact-checking system affects user interactions, such as dwell time, attention, and mental resources involved in using the system. A within-subject experiment was conducted, where participants used an interactive and a non-interactive version of a mock AI fact-checking system and rated their perceived correctness of COVID-19 related claims. We collected web-page interactions, eye-tracking data, and mental workload using NASA-TLX. We found that the presence of the affordance of interactively manipulating the AI system's prediction parameters affected users' dwell times, and eye-fixations on AOIs, but not mental workload. In the interactive system, participants spent the most time evaluating claims' correctness, followed by reading news. This promising result shows a positive role of interactivity in a mixed-initiative AI-powered system.
CL Sep 20, 2021
The Case for Claim Difficulty Assessment in Automatic Fact Checking
Prakhar Singh, Anubrata Das, Junyi Jessy Li et al.
Fact-checking is the process of evaluating the veracity of claims (i.e., purported facts). In this opinion piece, we raise an issue that has received little attention in prior work -- that some claims are far more difficult to fact-check than others. We discuss the implications this has for both practical fact-checking and research on automated fact-checking, including task formulation and dataset design. We report a manual analysis undertaken to explore factors underlying varying claim difficulty and identify several distinct types of difficulty. We motivate this new claim difficulty prediction task as beneficial to both automated fact-checking and practical fact-checking organizations.
IR May 12, 2021
Fairness in Information Access Systems
Michael D. Ekstrand, Anubrata Das, Robin Burke et al.
Recommendation, information retrieval, and other information access systems pose unique challenges for investigating and applying the fairness and non-discrimination concepts that have been developed for studying other machine learning systems. While fair information access shares many commonalities with fair classification, the multistakeholder nature of information access applications, the rank-based problem setting, the centrality of personalization in many cases, and the role of user response complicate the problem of identifying precisely what types and operationalizations of fairness may be relevant, let alone measuring or promoting them. In this monograph, we present a taxonomy of the various dimensions of fair information access and survey the literature to date on this new and rapidly-growing topic. We preface this with brief introductions to information access and algorithmic fairness, to facilitate use of this work by scholars with experience in one (or neither) of these fields who wish to learn about their intersection. We conclude with several open problems in fair information access, along with some suggestions for how to approach research in this space.
IR Jul 22, 2019
A Conceptual Framework for Evaluating Fairness in Search
Anubrata Das, Matthew Lease
While search efficacy has been evaluated traditionally on the basis of result relevance, fairness of search has attracted recent attention. In this work, we define a notion of distributional fairness and provide a conceptual framework for evaluating search results based on it. As part of this, we formulate a set of axioms which an ideal evaluation framework should satisfy for distributional fairness. We show how existing TREC test collections can be repurposed to study fairness, and we measure potential data bias to inform test collection design for fair search. A set of analyses show metric divergence between relevance and fairness, and we describe a simple but flexible interpolation strategy for integrating relevance and fairness into a single metric for optimization and evaluation.
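The interpolation strategy mentioned in this abstract can be pictured as a weighted blend of a relevance score and a fairness score. The linear form, the weight name `lam`, and the example scores below are assumptions for illustration; the abstract does not specify the exact formula.

```python
def interpolate(relevance, fairness, lam=0.5):
    """Blend two scores in [0, 1]: lam=1 uses relevance only, lam=0 fairness only.
    A plain linear interpolation, assumed here for illustration."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * relevance + (1.0 - lam) * fairness

# Hypothetical (relevance, fairness) scores for two ranked result lists.
systems = {"A": (0.80, 0.40), "B": (0.60, 0.90)}

def best_system(lam):
    """Name of the system with the highest blended score for a given weight."""
    return max(systems, key=lambda name: interpolate(*systems[name], lam=lam))

print(best_system(0.9), best_system(0.3))
```

The weight makes the trade-off explicit: with these toy scores, system A wins when relevance dominates and system B wins when fairness is weighted more heavily.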
IR Jul 8, 2019
CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking
Anubrata Das, Kunjan Mehta, Matthew Lease
The effect of user bias in fact-checking has not been explored extensively from a user-experience perspective. We estimate the user bias as a function of the user's perceived reputation of the news sources (e.g., a user with liberal beliefs may tend to trust liberal sources). We build an interface to communicate the role of estimated user bias in the context of a fact-checking task. We also explore the utility of helping users visualize their detected level of bias. 80% of the users of our system find that the presence of an indicator for user bias is useful in judging the veracity of a political claim.