Niloufar Salehi

HC
h-index 23
10 papers
583 citations
Novelty 30%
AI Score 26

10 Papers

HC · May 13, 2022
Beyond General Purpose Machine Translation: The Need for Context-specific Empirical Research to Design for Appropriate User Trust

Wesley Hanwen Deng, Nikita Mehandru, Samantha Robertson et al. · CMU

Machine Translation (MT) has the potential to help people overcome language barriers and is widely used in high-stakes scenarios, such as in hospitals. However, in order to use MT reliably and safely, users need to understand when to trust MT outputs and how to assess the quality of often imperfect translation results. In this paper, we discuss research directions to support users to calibrate trust in MT systems. We share findings from an empirical study in which we conducted semi-structured interviews with 20 clinicians to understand how they communicate with patients across language barriers, and if and how they use MT systems. Based on our findings, we advocate for empirical research on how MT systems are used in practice as an important first step to addressing the challenges in building appropriate trust between users and MT tools.

CL · Oct 25, 2023
Physician Detection of Clinical Harm in Machine Translation: Quality Estimation Aids in Reliance and Backtranslation Identifies Critical Errors

Nikita Mehandru, Sweta Agrawal, Yimin Xiao et al.

A major challenge in the practical use of Machine Translation (MT) is that users lack guidance to make informed decisions about when to rely on outputs. Progress in quality estimation research provides techniques to automatically assess MT quality, but these techniques have primarily been evaluated in vitro by comparison against human judgments outside of a specific context of use. This paper evaluates quality estimation feedback in vivo with a human study simulating decision-making in high-stakes medical settings. Using Emergency Department discharge instructions, we study how interventions based on quality estimation versus backtranslation assist physicians in deciding whether to show MT outputs to a patient. We find that quality estimation improves appropriate reliance on MT, but backtranslation helps physicians detect more clinically harmful errors that QE alone often misses.

HC · Mar 24, 2025
Generative AI in Knowledge Work: Design Implications for Data Navigation and Decision-Making

Bhada Yun, Dana Feng, Ace S. Chen et al.

Our study of 20 knowledge workers revealed a common challenge: the difficulty of synthesizing unstructured information scattered across multiple platforms to make informed decisions. Drawing on their vision of an ideal knowledge synthesis tool, we developed Yodeai, an AI-enabled system, to explore both the opportunities and limitations of AI in knowledge work. Through a user study with 16 product managers, we identified three key requirements for Generative AI in knowledge work: adaptable user control, transparent collaboration mechanisms, and the ability to integrate background knowledge with external information. However, we also found significant limitations, including overreliance on AI, user isolation, and contextual factors outside the AI's reach. As AI tools become increasingly prevalent in professional settings, we propose design principles that emphasize adaptability to diverse workflows, accountability in personal and collaborative contexts, and context-aware interoperability to guide the development of human-centered AI systems for product managers and knowledge workers.

CY · Mar 13, 2024
(Beyond) Reasonable Doubt: Challenges that Public Defenders Face in Scrutinizing AI in Court

Angela Jin, Niloufar Salehi

Accountable use of AI systems in high-stakes settings relies on making systems contestable. In this paper we study efforts to contest AI systems in practice by studying how public defenders scrutinize AI in court. We present findings from interviews with 17 people in the U.S. public defense community to understand their perceptions of and experiences scrutinizing computational forensic software (CFS) -- automated decision systems that the government uses to convict and incarcerate, such as facial recognition, gunshot detection, and probabilistic genotyping tools. We find that our participants faced challenges assessing and contesting CFS reliability due to difficulties (a) navigating how CFS is developed and used, (b) overcoming judges and jurors' non-critical perceptions of CFS, and (c) gathering CFS expertise. To conclude, we provide recommendations that center the technical, social, and institutional context to better position interventions such as performance evaluations to support contestability in practice.

HC · Nov 1, 2021
Bridging Action Frames: Instagram Infographics in U.S. Ethnic Movements

Darya Kaviani, Niloufar Salehi

Instagram infographics are a digital activism tool that have redefined action frames for technology-facilitated social movements. From the 1960s through the 1980s, United States ethnic movements practiced collective action: ideologically unified, resource-intensive traditional activism. Today, technologically enabled movements have been categorized as practicing connective action: individualized, low-resource online activism. Yet, we argue that Instagram infographics are both connective and collective. This paper juxtaposes the insights of past and present U.S. ethnic movement activists and analyzes Black Lives Matter Instagram data over the course of 7 years (2014-2020). We find that Instagram infographic activism bridges connective and collective action in three ways: (1) Scope for Education: Visually enticing and digestible infographics reduce the friction of information dissemination, facilitating collective movement education while preserving customizability. (2) Reconciliation for Credibility: Activists use connective features to combat infographic misinformation and resolve internal differences, creating a trusted collective movement front. (3) High-Resource Efforts for Transformative Change: Instagram infographic activism has been paired with boots on the ground and action-oriented content, curating a connective-to-collective pipeline that expends movement resources. Our work unveils the vitality of evaluating digital activism action frames at the movement integration level, exemplifies the powerful coexistence of connective and collective action, and offers meaningful design implications for activists seeking to leverage this novel tool.

HC · Jan 25, 2021
Modeling Assumptions Clash with the Real World: Transparency, Equity, and Community Challenges for Student Assignment Algorithms

Samantha Robertson, Tonya Nguyen, Niloufar Salehi

Across the United States, a growing number of school districts are turning to matching algorithms to assign students to public schools. The designers of these algorithms aimed to promote values such as transparency, equity, and community in the process. However, school districts have encountered practical challenges in their deployment. In fact, San Francisco Unified School District voted to stop using and completely redesign their student assignment algorithm because it was not promoting educational equity in practice. We analyze this system using a Value Sensitive Design approach and find that one reason values are not met in practice is that the system relies on modeling assumptions about families' priorities, constraints, and goals that clash with the real world. These assumptions overlook the complex barriers to ideal participation that many families face, particularly because of socioeconomic inequalities. We argue that direct, ongoing engagement with stakeholders is central to aligning algorithmic values with real world conditions. In doing so we must broaden how we evaluate algorithms while recognizing the limitations of purely algorithmic solutions in addressing complex socio-political problems.

HC · Jan 13, 2021
Whither AutoML? Understanding the Role of Automation in Machine Learning Workflows

Doris Xin, Eva Yiwei Wu, Doris Jung-Lin Lee et al.

Efforts to make machine learning more widely accessible have led to a rapid increase in Auto-ML tools that aim to automate the process of training and deploying machine learning. To understand how Auto-ML tools are used in practice today, we performed a qualitative study with participants ranging from novice hobbyists to industry researchers who use Auto-ML tools. We present insights into the benefits and deficiencies of existing tools, as well as the respective roles of the human and automation in ML workflows. Finally, we discuss design implications for the future of Auto-ML tool development. We argue that instead of full automation being the ultimate goal of Auto-ML, designers of these tools should focus on supporting a partnership between the user and the Auto-ML tool. This means that a range of Auto-ML tools will need to be developed to support varying user goals such as simplicity, reproducibility, and reliability.

CY · Jul 13, 2020
What If I Don't Like Any Of The Choices? The Limits of Preference Elicitation for Participatory Algorithm Design

Samantha Robertson, Niloufar Salehi

Emerging methods for participatory algorithm design have proposed collecting and aggregating individual stakeholder preferences to create algorithmic systems that account for those stakeholders' values. Using algorithmic student assignment as a case study, we argue that optimizing for individual preference satisfaction in the distribution of limited resources may actually inhibit progress towards social and distributive justice. Individual preferences can be a useful signal but should be expanded to support more expressive and inclusive forms of democratic participation.

HC · Oct 26, 2016
Huddler: Convening Stable and Familiar Crowd Teams Despite Unpredictable Availability

Niloufar Salehi, Andrew McCabe, Melissa Valentine et al.

Distributed, parallel crowd workers can accomplish simple tasks through workflows, but teams of collaborating crowd workers are necessary for complex goals. Unfortunately, a fundamental condition for effective teams - familiarity with other members - stands in contrast to crowd work's flexible, on-demand nature. We enable effective crowd teams with Huddler, a system for workers to assemble familiar teams even under unpredictable availability and strict time constraints. Huddler utilizes a dynamic programming algorithm to optimize for highly familiar teammates when individual availability is unknown. We first present a field experiment that demonstrates the value of familiarity for crowd teams: familiar crowd teams doubled the performance of ad-hoc (unfamiliar) teams on a collaborative task. We then report a two-week field deployment wherein Huddler enabled crowd workers to convene highly familiar teams in 18 minutes on average. This research advances the goal of supporting long-term, team-based collaborations without sacrificing the flexibility of crowd work.

HC · Feb 22, 2016
Atelier: Repurposing Expert Crowdsourcing Tasks as Micro-internships

Ryo Suzuki, Niloufar Salehi, Michelle S. Lam et al.

Expert crowdsourcing marketplaces have untapped potential to empower workers' career and skill development. Currently, many workers cannot afford to invest the time and sacrifice the earnings required to learn a new skill, and a lack of experience makes it difficult to get job offers even if they do. In this paper, we seek to lower the threshold to skill development by repurposing existing tasks on the marketplace as mentored, paid, real-world work experiences, which we refer to as micro-internships. We instantiate this idea in Atelier, a micro-internship platform that connects crowd interns with crowd mentors. Atelier guides mentor-intern pairs to break down expert crowdsourcing tasks into milestones, review intermediate output, and problem-solve together. We conducted a field experiment comparing Atelier's mentorship model to a non-mentored alternative on a real-world programming crowdsourcing task, finding that Atelier helped interns maintain forward progress and absorb best practices.