CYAug 19, 2022
Atomist or Holist? A Diagnosis and Vision for More Productive Interdisciplinary AI Ethics DialogueTravis Greene, Amit Dhurandhar, Galit Shmueli
In response to growing recognition of the social impact of new AI-based technologies, major AI and ML conferences and journals now encourage or require papers to include ethics impact statements and undergo ethics reviews. This move has sparked heated debate concerning the role of ethics in AI research, at times devolving into name-calling and threats of "cancellation." We diagnose this conflict as one between atomist and holist ideologies. Among other things, atomists believe facts are and should be kept separate from values, while holists believe facts and values are and should be inextricable from one another. With the goal of reducing disciplinary polarization, we draw on numerous philosophical and historical sources to describe each ideology's core beliefs and assumptions. Finally, we call on atomists and holists within the ever-expanding data science community to exhibit greater empathy during ethical disagreements and propose four targeted strategies to ensure AI research benefits society.
LGApr 9, 2025
Beware of "Explanations" of AIDavid Martens, Galit Shmueli, Theodoros Evgeniou et al.
Understanding the decisions made and actions taken by increasingly complex AI system remains a key challenge. This has led to an expanding field of research in explainable artificial intelligence (XAI), highlighting the potential of explanations to enhance trust, support adoption, and meet regulatory standards. However, the question of what constitutes a "good" explanation is dependent on the goals, stakeholders, and context. At a high level, psychological insights such as the concept of mental model alignment can offer guidance, but success in practice is challenging due to social and technical factors. As a result of this ill-defined nature of the problem, explanations can be of poor quality (e.g. unfaithful, irrelevant, or incoherent), potentially leading to substantial risks. Instead of fostering trust and safety, poorly designed explanations can actually cause harm, including wrong decisions, privacy violations, manipulation, and even reduced AI adoption. Therefore, we caution stakeholders to beware of explanations of AI: while they can be vital, they are not automatically a remedy for transparency or responsible AI adoption, and their misuse or limitations can exacerbate harm. Attention to these caveats can help guide future research to improve the quality and impact of AI explanations.
MLMay 19, 2025
From What Ifs to Insights: Counterfactuals in Causal Inference vs. Explainable AIGalit Shmueli, David Martens, Jaewon Yoo et al.
Counterfactuals play a pivotal role in the two distinct data science fields of causal inference (CI) and explainable artificial intelligence (XAI). While the core idea behind counterfactuals remains the same in both fields--the examination of what would have happened under different circumstances--there are key differences in how they are used and interpreted. We introduce a formal definition that encompasses the multi-faceted concept of the counterfactual in CI and XAI. We then discuss how counterfactuals are used, evaluated, generated, and operationalized in CI vs. XAI, highlighting conceptual and practical differences. By comparing and contrasting the two, we hope to identify opportunities for cross-fertilization across CI and XAI.
CYAug 31, 2020
Beyond Our Behavior: The GDPR and Humanistic PersonalizationTravis Greene, Galit Shmueli
Personalization should take the human person seriously. This requires a deeper understanding of how recommender systems can shape both our self-understanding and identity. We unpack key European humanistic and philosophical ideas underlying the General Data Protection Regulation (GDPR) and propose a new paradigm of humanistic personalization. Humanistic personalization responds to the IEEE's call for Ethically Aligned Design (EAD) and is based on fundamental human capacities and values. Humanistic personalization focuses on narrative accuracy: the subjective fit between a person's self-narrative and both the input (personal data) and output of a recommender system. In doing so, we re-frame the distinction between implicit and explicit data collection as one of nonconscious ("organismic") behavior and conscious ("reflective") action. This distinction raises important ethical and interpretive issues related to agency, self-understanding, and political participation. Finally, we discuss how an emphasis on narrative accuracy can reduce opportunities for epistemic injustice done to data subjects.
CYAug 26, 2020
How to "Improve" Prediction Using Behavior ModificationGalit Shmueli, Ali Tafti
Many internet platforms that collect behavioral big data use it to predict user behavior for internal purposes and for their business customers (e.g., advertisers, insurers, security forces, governments, political consulting firms) who utilize the predictions for personalization, targeting, and other decision-making. Improving predictive accuracy is therefore extremely valuable. Data science researchers design algorithms, models, and approaches to improve prediction. Prediction is also improved with larger and richer data. Beyond improving algorithms and data, platforms can stealthily achieve better prediction accuracy by pushing users' behaviors towards their predicted values, using behavior modification techniques, thereby demonstrating more certain predictions. Such apparent "improved" prediction can result from employing reinforcement learning algorithms that combine prediction and behavior modification. This strategy is absent from the machine learning and statistics literature. Investigating its properties requires integrating causal with predictive notation. To this end, we incorporate Pearl's causal do(.) operator into the predictive vocabulary. We then decompose the expected prediction error given behavior modification, and identify the components impacting predictive power. Our derivation elucidates implications of such behavior modification to data scientists, platforms, their customers, and the humans whose behavior is manipulated. Behavior modification can make users' behavior more predictable and even more homogeneous; yet this apparent predictability might not generalize when business customers use predictions in practice. Outcomes pushed towards their predictions can be at odds with customers' intentions, and harmful to manipulated users.
MLDec 17, 2019
How Personal is Machine Learning Personalization?Travis Greene, Galit Shmueli
Though used extensively, the concept and process of machine learning (ML) personalization have generally received little attention from academics, practitioners, and the general public. We describe the ML approach as relying on the metaphor of the person as a feature vector and contrast this with humanistic views of the person. In light of the recent calls by the IEEE to consider the effects of ML on human well-being, we ask whether ML personalization can be reconciled with these humanistic views of the person, which highlight the importance of moral and social identity. As human behavior increasingly becomes digitized, analyzed, and predicted, to what extent do our subsequent decisions about what to choose, buy, or do, made both by us and others, reflect who we are as persons? This paper first explicates the term personalization by considering ML personalization and highlights its relation to humanistic conceptions of the person, then proposes several dimensions for evaluating the degree of personalization of ML personalized scores. By doing so, we hope to contribute to current debate on the issues of algorithmic bias, transparency, and fairness in machine learning.
MLJun 8, 2019
Lift Up and Act! Classifier Performance in Resource-Constrained ApplicationsGalit Shmueli
Classification tasks are common across many fields and applications where the decision maker's action is limited by resource constraints. In direct marketing only a subset of customers is contacted; scarce human resources limit the number of interviews to the most promising job candidates; limited donated organs are prioritized to those with best fit. In such scenarios, performance measures such as the classification matrix, ROC analysis, and even ranking metrics such as AUC measures outcomes different from the action of interest. At the same time, gains and lift that do measure the relevant outcome are rarely used by machine learners. In this paper we define resource-constrained classifier performance as a task distinguished from classification and ranking. We explain how gains and lift can lead to different algorithm choices and discuss the effect of class distribution.