CY · Mar 6
The Values of Value in AI Adoption: Rethinking Efficiency in UX Designers' Workplaces
Inha Cha, Catherine Wieczorek, Richmond Y. Wong · gatech
Although organizations increasingly position AI adoption as a pathway to competitiveness and innovation, organizations' perspectives on productivity and efficiency often clash with workers' perspectives on AI's economic and social value. Through design workshops with 15 UX designers, we examine how AI adoption unfolds across individual, team, and organizational scales. At the individual level, designers weighed efficiency, skill development, and professional worth. At the team level, they negotiated collaboration, responsibility, and rigor. At the organizational level, adoption was shaped by compliance requirements and organizational norms. Across these scales, discourses of efficiency carried social and ethical dimensions of responsibility, trust, and autonomy. We view adoption as a site where roles, relationships, and power are reconfigured. We argue that AI adoption should be understood as a process of negotiating values, and call for future work examining how AI systems redistribute responsibility among team members, while understanding how such shifts could strengthen worker agency.
CL · Feb 9, 2024 · Code
The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate
Juhyun Oh, Eunsu Kim, Inha Cha et al.
This paper explores the assumption that Large Language Models (LLMs) skilled in generation tasks are equally adept as evaluators. We assess the performance of three LLMs and one open-source LM in Question-Answering (QA) and evaluation tasks using the TriviaQA (Joshi et al., 2017) dataset. Results indicate a significant disparity, with LLMs exhibiting lower performance in evaluation tasks compared to generation tasks. Intriguingly, we discover instances of unfaithful evaluation where models accurately evaluate answers in areas where they lack competence, underscoring the need to examine the faithfulness and trustworthiness of LLMs as evaluators. This study contributes to the understanding of "the Generative AI Paradox" (West et al., 2023), highlighting a need to explore the correlation between generative excellence and evaluation proficiency, and the necessity to scrutinize the faithfulness aspect in model evaluations.
CL · Sep 1, 2025
Culture is Everywhere: A Call for Intentionally Cultural Evaluation
Juhyun Oh, Inha Cha, Michael Saxon et al. · cmu
The prevailing "trivia-centered paradigm" for evaluating the cultural alignment of large language models (LLMs) is increasingly inadequate as these models become more advanced and widely deployed. Existing approaches typically reduce culture to static facts or values, testing models via multiple-choice or short-answer questions that treat culture as isolated trivia. Such methods neglect the pluralistic and interactive realities of culture, and overlook how cultural assumptions permeate even ostensibly "neutral" evaluation settings. In this position paper, we argue for intentionally cultural evaluation: an approach that systematically examines the cultural assumptions embedded in all aspects of evaluation, not just in explicitly cultural tasks. We systematically characterize the what, how, and circumstances by which culturally contingent considerations arise in evaluation, and emphasize the importance of researcher positionality for fostering inclusive, culturally aligned NLP research. Finally, we discuss implications and future directions for moving beyond current benchmarking practices, discovering important applications that we don't know exist, and involving communities in evaluation design through HCI-inspired participatory methodologies.