HC · Apr 12, 2023
Angler: Helping Machine Translation Practitioners Prioritize Model Improvements
Samantha Robertson, Zijie J. Wang, Dominik Moritz et al. · apple-ml, cmu
Machine learning (ML) models can fail in unexpected ways in the real world, but not all model failures are equal. With finite time and resources, ML practitioners are forced to prioritize their model debugging and improvement efforts. Through interviews with 13 ML practitioners at Apple, we found that practitioners construct small targeted test sets to estimate an error's nature, scope, and impact on users. We built on this insight in a case study with machine translation models, and developed Angler, an interactive visual analytics tool to help practitioners prioritize model improvements. In a user study with 7 machine translation experts, we used Angler to understand prioritization practices when the input space is infinite, and obtaining reliable signals of model quality is expensive. Our study revealed that participants could form more interesting and user-focused hypotheses for prioritization by analyzing quantitative summary statistics and qualitatively assessing data by reading sentences.
HC · May 13, 2022
Beyond General Purpose Machine Translation: The Need for Context-specific Empirical Research to Design for Appropriate User Trust
Wesley Hanwen Deng, Nikita Mehandru, Samantha Robertson et al. · cmu
Machine Translation (MT) has the potential to help people overcome language barriers and is widely used in high-stakes scenarios, such as in hospitals. However, in order to use MT reliably and safely, users need to understand when to trust MT outputs and how to assess the quality of often imperfect translation results. In this paper, we discuss research directions for helping users calibrate their trust in MT systems. We share findings from an empirical study in which we conducted semi-structured interviews with 20 clinicians to understand how they communicate with patients across language barriers, and whether and how they use MT systems. Based on our findings, we advocate for empirical research on how MT systems are used in practice as an important first step toward addressing the challenges in building appropriate trust between users and MT tools.
HC · Jan 25, 2021
Modeling Assumptions Clash with the Real World: Transparency, Equity, and Community Challenges for Student Assignment Algorithms
Samantha Robertson, Tonya Nguyen, Niloufar Salehi
Across the United States, a growing number of school districts are turning to matching algorithms to assign students to public schools. The designers of these algorithms aimed to promote values such as transparency, equity, and community in the process. However, school districts have encountered practical challenges in their deployment. In fact, San Francisco Unified School District voted to stop using and completely redesign their student assignment algorithm because it was not promoting educational equity in practice. We analyze this system using a Value Sensitive Design approach and find that one reason values are not met in practice is that the system relies on modeling assumptions about families' priorities, constraints, and goals that clash with the real world. These assumptions overlook the complex barriers to ideal participation that many families face, particularly because of socioeconomic inequalities. We argue that direct, ongoing engagement with stakeholders is central to aligning algorithmic values with real-world conditions. In doing so, we must broaden how we evaluate algorithms while recognizing the limitations of purely algorithmic solutions in addressing complex socio-political problems.
CY · Jul 13, 2020
What If I Don't Like Any Of The Choices? The Limits of Preference Elicitation for Participatory Algorithm Design
Samantha Robertson, Niloufar Salehi
Emerging methods for participatory algorithm design have proposed collecting and aggregating individual stakeholder preferences to create algorithmic systems that account for those stakeholders' values. Using algorithmic student assignment as a case study, we argue that optimizing for individual preference satisfaction in the distribution of limited resources may actually inhibit progress towards social and distributive justice. Individual preferences can be a useful signal but should be expanded to support more expressive and inclusive forms of democratic participation.