Laura Dabbish

HC
7 papers
210 citations
Novelty 26%
AI Score 43

7 Papers

HC · Mar 31
Locating Risk: Task Designers and the Challenge of Risk Disclosure in RAI Content Work

Alice Qian, Ryland Shaw, Laura Dabbish et al.

As AI systems are increasingly tested and deployed in open-ended and high-stakes domains, crowdworkers are often tasked with responsible AI (RAI) content work. These tasks include labeling violent content, moderating disturbing text, or simulating harmful behavior for red teaming exercises to shape AI system behaviors. While prior research efforts have highlighted the risks to worker well-being associated with RAI content work, far less attention has been paid to how these risks are communicated to workers by task designers, the individuals who design and post RAI tasks. Existing transparency frameworks and guidelines, such as model cards, datasheets, and crowdworksheets, focus on documenting model information and dataset collection processes, but they overlook an important aspect: disclosing well-being risks to workers. In the absence of standard workflows or clear guidance, it remains unclear how consistently content warnings, consent flows, or other forms of well-being risk disclosure are applied. This study investigates how task designers approach risk disclosure in crowdsourced RAI tasks. Drawing on interviews with 23 task designers across academic and industry sectors, we examine how well-being risk is recognized, interpreted, and communicated in practice. Our findings highlight the need to support task designers in identifying and communicating risks not only to support crowdworker well-being but also to strengthen the ethical integrity and technical efficacy of AI development pipelines.

HC · May 8
Towards Apples to Apples for AI Evaluations: From Real-World Use Cases to Evaluation Scenarios

Yee-Yin Choong, Kristen Greene, Alice Qian et al.

AI measurement science has a wide variety of methodologies and measurements for comparing AI systems, resulting in what often appear to be "apples-to-oranges" comparisons across AI evaluations. To move toward "apples-to-apples" comparisons in real-world AI evaluations, this work advocates for methodological transparency in evaluation scenarios, operational grounding, and human-centered design (HCD) principles. We propose a repeatable process for transforming high-level use cases into detailed scenarios by eliciting use cases from subject matter experts (SMEs) via a structured AI Use Case Worksheet with six key elements: use case, sector, user (direct and indirect), intended outcomes, expected impacts (positive and negative), and KPIs and metrics. We demonstrate the utility of the worksheet and process in the U.S. financial services sector. This paper reports on example high-level AI use cases identified by financial services sector SMEs: cyber defense enablement, developer productivity, financial crime aggregation, suspicious activity report (SAR) filing, credit memo generation, and internal call center support. These use cases are illustrative of the process, not exhaustive. Central to our work is a three-stage expansion pipeline combining LLM prompting with human reviews to generate 107 scenarios from the use cases elicited from SMEs. This process integrates iterative human reviews at every juncture to ensure operational grounding: for scenario titles and descriptions; for core scenario elements like users, benefits and risks, and metrics; and for scenario narratives and evaluation objectives. Human checkpoints ensure scenarios remain reflective of real-world usage and human needs. We describe a validation rubric to assess scenario quality. By defining key scenario components, this work supports a more consistent and meaningful paradigm for human-centered AI evaluations.

HC · Jun 30, 2014 · Code
Transparency and Coordination in Peer Production

Laura Dabbish, Colleen Stuart, Jason Tsay et al.

This paper examines coordination in transparent work environments: environments where the content of work artifacts, and the actions taken on these artifacts, are fully visible to organizational members. Our qualitative study of a community of open source software developers revealed a coordination system characterized by interest-based, asynchronous interaction and knowledge transfer. At the core of asynchronous knowledge transfer lies the concept of quasi-codification, which occurs when rich process knowledge is implicitly encoded in work artifacts. Our findings suggest that members are able to more selectively form dependencies, monitor the trajectory of projects, and make their work understandable to others, which facilitates coordination. We discuss two important characteristics that enable coordination activities in a transparent environment: the presence of an imagined audience that dictates the way artifacts are crafted, and experience within the environment that allows individuals to derive knowledge from these artifacts. By showing how transparency influences coordination, this research challenges previous conceptions of coordination for complex, collaborative work.

HC · Mar 31
Worker Discretion Advised: Co-designing Risk Disclosure in Crowdsourced Responsible AI (RAI) Content Work

Alice Qian, Ziqi Yang, Ryland Shaw et al.

Responsible AI (RAI) content work, such as annotation, moderation, or red teaming for AI safety, often exposes crowd workers to potentially harmful content. While prior work has underscored the importance of communicating well-being risk to employed content moderators, designing effective disclosure mechanisms for crowd workers while balancing worker protection with the needs of task designers and platforms remains largely unexamined. To address this gap, we conducted individual co-design sessions with 15 task designers, 11 crowdworkers, and 3 platform representatives. We investigated task designer preferences for support in disclosing tasks, worker preferences for receiving risk disclosure warnings, and how platform representatives envision their role in shaping risk disclosure practices. We identify design tensions and map the sociotechnical tradeoffs that shape disclosure practices. We contribute design recommendations and feature concepts for risk disclosure mechanisms in the context of RAI content work.

HC · Feb 16, 2021
Significant Otter: Understanding the Role of Biosignals in Communication

Fannie Liu, Chunjong Park, Yu Jiang Tham et al.

With the growing ubiquity of wearable devices, sensed physiological responses provide new means to connect with others. While recent research demonstrates the expressive potential for biosignals, the value of sharing these personal data remains unclear. To understand their role in communication, we created Significant Otter, an Apple Watch/iPhone app that enables romantic partners to share and respond to each other's biosignals in the form of animated otter avatars. In a one-month study with 20 couples, participants used Significant Otter with biosignals sensing OFF and ON. We found that while sensing OFF enabled couples to keep in touch, sensing ON enabled easier and more authentic communication that fostered social connection. However, the addition of biosignals introduced concerns about autonomy and agency over the messages they sent. We discuss design implications and future directions for communication systems that recommend messages based on biosignals.

HC · May 25, 2020
Decentralized is not risk-free: Understanding public perceptions of privacy-utility trade-offs in COVID-19 contact-tracing apps

Tianshi Li, Jackie (Junrui) Yang et al.

Contact-tracing apps have potential benefits in helping health authorities to act swiftly to halt the spread of COVID-19. However, their effectiveness is heavily dependent on their installation rate, which may be influenced by people's perceptions of the utility of these apps and any potential privacy risks due to the collection and release of sensitive user data (e.g., user identity and location). In this paper, we present a survey study that examined people's willingness to install six different contact-tracing apps after informing them of the risks and benefits of each design option (with a U.S.-only sample on Amazon Mechanical Turk, $N=208$). The six app designs covered two major design dimensions (centralized vs. decentralized, basic contact tracing vs. also providing hotspot information), grounded in our analysis of existing contact-tracing app proposals. Contrary to the assumptions of some prior work, we found that the majority of people in our sample preferred to install apps that use a centralized server for contact tracing, as they are more willing to allow a centralized authority to access the identity of app users rather than allowing tech-savvy users to infer the identity of diagnosed users. We also found that the majority of our sample preferred to install apps that share diagnosed users' recent locations in public places to show hotspots of infection. Our results suggest that apps using a centralized architecture with strong security protection to do basic contact tracing and providing users with other useful information such as hotspots of infection in public places may achieve a high adoption rate in the U.S.

HC · Apr 12, 2019
Animo: Sharing Biosignals on a Smartwatch for Lightweight Social Connection

Fannie Liu, Mario Esparza, Maria Pavlovskaia et al.

We present Animo, a smartwatch app that enables people to share and view each other's biosignals. We designed and engineered Animo to explore new ground for smartwatch-based biosignals social computing systems: identifying opportunities where these systems can support lightweight and mood-centric interactions. In our work we develop, explore, and evaluate several innovative features designed for dyadic communication of heart rate. We discuss the results of a two-week study (N=34), including new communication patterns participants engaged in, and outline the design landscape for communicating with biosignals on smartwatches.