Amy Winecoff

CY
h-index: 14
6 papers
72 citations
Novelty: 29%
AI Score: 44

6 Papers

CY · Feb 19, 2023
Upvotes? Downvotes? No Votes? Understanding the relationship between reaction mechanisms and political discourse on Reddit

Orestis Papakyriakopoulos, Severin Engelmann, Amy Winecoff

A significant share of political discourse occurs online on social media platforms. Policymakers and researchers try to understand the role of social media design in shaping the quality of political discourse around the globe. Over the past decades, scholarship on political discourse theory has identified distinct characteristics of prominent types of political rhetoric, such as deliberative, civic, and demagogic discourse. This study investigates the relationship between social media reaction mechanisms (i.e., upvotes, downvotes) and political rhetoric in user discussions by engaging in an in-depth conceptual analysis of political discourse theory. First, we analyze 155 million user comments in 55 political subforums on Reddit between 2010 and 2018 to explore whether users' style of political discussion aligns with the essential components of deliberative, civic, and demagogic discourse. Second, we perform a quantitative study that combines confirmatory factor analysis with difference-in-differences models to explore whether different reaction mechanism schemes (e.g., upvotes only, upvotes and downvotes, no reaction mechanisms) correspond with political user discussion that is more or less characteristic of deliberative, civic, or demagogic discourse. We produce three main takeaways. First, although political discourse theories are "ideal constructs of political rhetoric," we find that they describe political discussions on Reddit to a large extent. Second, we find that discussions in subforums with only upvotes, or with both upvotes and downvotes, are associated with user discourse that is more deliberative and civic. Third, social media discussions are most demagogic in subreddits with no reaction mechanisms at all. These findings offer valuable contributions to ongoing policy discussions on the relationship between social media interface design and respectful political discussion among users.

HC · Feb 10 · Code
Navigating Uncertainties: How GenAI Developers Document Their Models on Open-Source Platforms

Ningjing Tang, Megan Li, Amy Winecoff et al.

Model documentation plays a crucial role in promoting transparency and responsible development of AI systems. With the rise of Generative AI (GenAI), open-source platforms have increasingly become hubs for hosting and distributing these models, prompting platforms like Hugging Face to develop dedicated model documentation guidelines that align with responsible AI principles. Despite these growing efforts, there remains a lack of understanding of how developers document their GenAI models on open-source platforms. Through interviews with 13 GenAI developers active on open-source platforms, we provide empirical insights into their documentation practices and challenges. Our analysis reveals that despite existing resources, developers of GenAI models still face multiple layers of uncertainty in their model documentation: (1) uncertainty about what specific content should be included; (2) uncertainty about how to effectively report key components of their models; and (3) uncertainty about who should take responsibility for various aspects of model documentation. Based on our findings, we discuss the implications for policymakers, open-source platforms, and the research community to support meaningful, effective, and actionable model documentation in the GenAI era, including cultivating better community norms, building robust evaluation infrastructures, and clarifying roles and responsibilities.

HC · Dec 4, 2025
From Symptoms to Systems: An Expert-Guided Approach to Understanding Risks of Generative AI for Eating Disorders

Amy Winecoff, Kevin Klyman

Generative AI systems may pose serious risks to individuals vulnerable to eating disorders. Existing safeguards tend to overlook subtle but clinically significant cues, leaving many risks unaddressed. To better understand the nature of these risks, we conducted semi-structured interviews with 15 clinicians, researchers, and advocates with expertise in eating disorders. Using abductive qualitative analysis, we developed an expert-guided taxonomy of generative AI risks across seven categories: (1) providing generalized health advice; (2) encouraging disordered behaviors; (3) supporting symptom concealment; (4) creating thinspiration; (5) reinforcing negative self-beliefs; (6) promoting excessive focus on the body; and (7) perpetuating narrow views about eating disorders. Our results demonstrate how certain user interactions with generative AI systems intersect with clinical features of eating disorders in ways that may intensify risk. We discuss implications of our work, including approaches for risk assessment, safeguard design, and participatory evaluation practices with domain experts.

92.6 · CY · Apr 27
Safety Drift After Fine-Tuning: Evidence from High-Stakes Domains

Emaan Bilal Khan, Amy Winecoff, Miranda Bogen et al.

Foundation models are routinely fine-tuned for use in particular domains, yet safety assessments are typically conducted only on base models, implicitly assuming that safety properties persist through downstream adaptation. We test this assumption by analyzing the safety behavior of 100 models, including widely deployed fine-tunes in the medical and legal domains as well as controlled adaptations of open foundation models alongside their bases. Across general-purpose and domain-specific safety benchmarks, we find that benign fine-tuning induces large, heterogeneous, and often contradictory changes in measured safety: models frequently improve on some instruments while degrading on others, with substantial disagreement across evaluations. These results show that safety behavior is not stable under ordinary downstream adaptation, raising critical questions about governance and deployment practices centered on base-model evaluations. Without explicit re-evaluation of fine-tuned models in deployment-relevant contexts, such approaches fall short of adequately managing downstream risk, overlooking practical sources of harm -- failures that are especially consequential in high-stakes settings and challenge current accountability paradigms.

CY · Jul 19, 2021 · Code
T-RECS: A Simulation Tool to Study the Societal Impact of Recommender Systems

Eli Lucherini, Matthew Sun, Amy Winecoff et al.

Simulation has emerged as a popular method to study the long-term societal consequences of recommender systems. This approach allows researchers to specify their theoretical model explicitly and observe the evolution of system-level outcomes over time. However, performing simulation-based studies often requires researchers to build their own simulation environments from the ground up, which creates a high barrier to entry, introduces room for implementation error, and makes it difficult to disentangle whether observed outcomes are due to the model or the implementation. We introduce T-RECS, an open-source Python package designed for researchers to simulate recommendation systems and other types of sociotechnical systems in which an algorithm mediates the interactions between multiple stakeholders, such as users and content creators. To demonstrate the flexibility of T-RECS, we replicate two prior simulation-based studies of sociotechnical systems. We additionally show how T-RECS can be used to generate novel insights with minimal overhead. Our tool promotes reproducibility in this area of research, provides a unified language for simulating sociotechnical systems, and removes the friction of implementing simulations from scratch.

HC · Dec 7, 2021
Qualitative Analysis for Human Centered AI

Orestis Papakyriakopoulos, Elizabeth Anne Watkins, Amy Winecoff et al.

Human-centered artificial intelligence (AI) posits that machine learning and AI should be developed and applied in a socially aware way. In this article, we argue that qualitative analysis (QA) can be a valuable tool in this process, supplementing, informing, and extending the possibilities of AI models. We show this by describing how QA can be integrated into the current prediction paradigm of AI, assisting scientists in the process of selecting data, variables, and model architectures. Furthermore, we argue that QA can be a part of novel paradigms for human-centered AI. QA can support scientists and practitioners in practical problem solving and situated model development. It can also promote participatory design approaches, reveal understudied and emerging issues in AI systems, and assist policy making.