Ningjing Tang

h-index 3

5 papers

12 citations

Novelty 24%

AI Score 43

Ranked #79,000 of 205,806 authors (top 38%); #578 in HC (top 20%)

5 Papers

HC · Feb 10 · Code

Navigating Uncertainties: How GenAI Developers Document Their Models on Open-Source Platforms

Ningjing Tang, Megan Li, Amy Winecoff et al.

Model documentation plays a crucial role in promoting transparency and responsible development of AI systems. With the rise of Generative AI (GenAI), open-source platforms have increasingly become hubs for hosting and distributing these models, prompting platforms like Hugging Face to develop dedicated model documentation guidelines that align with responsible AI principles. Despite these growing efforts, there remains a lack of understanding of how developers document their GenAI models on open-source platforms. Through interviews with 13 GenAI developers active on open-source platforms, we provide empirical insights into their documentation practices and challenges. Our analysis reveals that despite existing resources, developers of GenAI models still face multiple layers of uncertainties in their model documentation: (1) uncertainties about what specific content should be included; (2) uncertainties about how to effectively report key components of their models; and (3) uncertainties in deciding who should take responsibilities for various aspects of model documentation. Based on our findings, we discuss the implications for policymakers, open-source platforms, and the research community to support meaningful, effective and actionable model documentation in the GenAI era, including cultivating better community norms, building robust evaluation infrastructures, and clarifying roles and responsibilities.

84.7 · HC · May 2

Beyond the Single Turn: Reframing Refusals as Dynamic Experiences Embedded in the Context of Mental Health Support Interactions with LLMs

Ningjing Tang, Alice Qian, Qiaosi Wang et al.

Content Warning: This paper contains participant quotes and discussions related to mental health challenges, emotional distress, and suicidal ideation. Large language models (LLMs) are increasingly used for mental health support, yet the model safeguards -- particularly refusals to engage with sensitive content -- remain poorly understood from the perspectives of users and mental health professionals (MHPs) and have been reported to cause real-world harms. This paper presents findings from a sequential mixed-methods study examining how LLM refusals are experienced and interpreted in mental health support interactions. Through surveys (N=53) and in-depth interviews (N=16) with individuals using LLMs for mental health support and MHPs, we reveal that refusals are not isolated, single-turn system behaviors but rather constitute dynamic, multi-phase experiences: pre-refusal expectation formation, refusal triggering and encounter, refusal message framing, resource referral provision, and post-refusal outcomes. We contribute a multi-phase framework for evaluating refusals beyond binary policy compliance accuracy and design recommendations for future refusal mechanisms. These findings suggest that understanding LLM refusals requires moving beyond single-turn interactions toward recognizing them as holistic experiences embedded within users' support-seeking trajectories and the broader LLM design pipeline.

CY · Jun 2, 2025

A Closer Look at the Existing Risks of Generative AI: Mapping the Who, What, and How of Real-World Incidents

Megan Li, Wendy Bickersteth, Ningjing Tang et al.

Due to its general-purpose nature, Generative AI is applied in an ever-growing set of domains and tasks, leading to an expanding set of risks of harm impacting people, communities, society, and the environment. These risks may arise due to failures during the design and development of the technology, as well as during its release, deployment, or downstream usages and appropriations of its outputs. In this paper, building on prior taxonomies of AI risks, harms, and failures, we construct a taxonomy specifically for Generative AI failures and map them to the harms they precipitate. Through a systematic analysis of 499 publicly reported incidents, we describe what harms are reported, how they arose, and who they impact. We report the prevalence of each type of harm, underlying failure mode, and harmed stakeholder, as well as their common co-occurrences. We find that most reported incidents are caused by use-related issues but bring harm to parties beyond the end user(s) of the Generative AI system at fault, and that the landscape of Generative AI harms is distinct from that of traditional AI. Our work offers actionable insights to policymakers, developers, and Generative AI users. In particular, we call for the prioritization of non-technical risk and harm mitigation strategies, including public disclosures and education and careful regulatory stances.

HC · Feb 9

Large Language Models in Peer-Run Community Behavioral Health Services: Understanding Peer Specialists and Service Users' Perspectives on Opportunities, Risks, and Mitigation Strategies

Cindy Peng, Megan Chai, Gao Mo et al.

Peer-run organizations (PROs) provide critical, recovery-based behavioral health support rooted in lived experience. As large language models (LLMs) enter this domain, their scale, conversationality, and opacity introduce new challenges for situatedness, trust, and autonomy. Partnering with Collaborative Support Programs of New Jersey (CSPNJ), a statewide PRO in the Northeastern United States, we used comicboarding, a co-design method, to conduct workshops with 16 peer specialists and 10 service users exploring perceptions of integrating an LLM-based recommendation system into peer support. Findings show that depending on how LLMs are introduced, constrained, and co-used, they can reconfigure in-room dynamics by sustaining, undermining, or amplifying the relational authority that grounds peer support. We identify opportunities, risks, and mitigation strategies across three tensions: bridging scale and locality, protecting trust and relational dynamics, and preserving peer autonomy amid efficiency gains. We contribute design implications that center lived-experience-in-the-loop, reframe trust as co-constructed, and position LLMs not as clinical tools but as relational collaborators in high-stakes, community-led care.

80.9 · HC · Apr 24

What People See (and Miss) About Generative AI Risks: Perceptions of Failures, Risks, and Who Should Address Them

Megan Li, Wendy Bickersteth, Ningjing Tang et al.

Despite growing concerns about the risks of Generative AI (GenAI), there is limited understanding of public perceptions of these risks and their associated failure modes -- defined as recurring patterns of sociotechnical breakdown across the GenAI lifecycle that contribute to risks of real-world harm. To address this gap, we present a survey instrument, validated with eight subject matter experts and deployed on a sample of 960 U.S.-based participants, to assess awareness and perceptions of GenAI's failure modes, their associated risks, and stakeholder responsibilities to address them. To support realism and content validity, our instrument is structured around scenarios grounded in publicly reported incidents and a taxonomy of GenAI's failure modes. Findings suggest that our instrument is (1) effective for assessing risk awareness and perceptions in a way that is grounded in people's current contexts of use, yet is extensible to new contexts that will inevitably arise; and (2) potentially useful for informing the design of AI literacy tools and interventions. We argue for AI literacy and governance approaches that align with how people encounter and reason about GenAI in everyday life.