LG · Oct 22, 2022
Policy Optimization with Advantage Regularization for Long-Term Fairness in Decision Systems
Eric Yang Yu, Zhizhen Qin, Min Kyung Lee et al.
Long-term fairness is an important consideration in designing and deploying learning-based decision systems in high-stakes decision-making contexts. Recent work has proposed the use of Markov Decision Processes (MDPs) to formulate decision-making with long-term fairness requirements in dynamically changing environments, and demonstrated major challenges in directly deploying heuristic and rule-based policies that worked well in static environments. We show that policy optimization methods from deep reinforcement learning can be used to find strictly better decision policies that can often achieve both higher overall utility and fewer violations of the fairness requirements, compared to previously known strategies. In particular, we propose new methods for imposing fairness requirements in policy optimization by regularizing the advantage evaluation of different actions. Our proposed methods make it easy to impose fairness constraints without reward engineering or sacrificing training efficiency. We perform detailed analyses in three established case studies: attention allocation in incident monitoring, bank loan approval, and vaccine distribution in population networks.
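To make the idea of "regularizing the advantage evaluation" concrete, here is a minimal sketch. The abstract does not give the regularizer, so the penalty form below (subtracting a scaled, clipped fairness violation from each advantage estimate) is an assumption. The function name `regularized_advantages` and the parameter `beta` are hypothetical.

```python
from typing import List


def regularized_advantages(
    advantages: List[float],
    violations: List[float],
    beta: float = 1.0,
) -> List[float]:
    """Penalize advantage estimates by a fairness-violation term.

    ASSUMPTION: the penalty is beta * max(0, violation). The abstract
    does not specify the actual regularizer, so this form is illustrative.
    """
    if len(advantages) != len(violations):
        raise ValueError("advantages and violations must have equal length")
    # Only positive violations are penalized; satisfied constraints
    # (violation <= 0) leave the advantage unchanged.
    return [a - beta * max(0.0, v) for a, v in zip(advantages, violations)]


# Hypothetical per-step advantage estimates and fairness violations.
adv = [1.0, 0.5, -0.2]
viol = [0.8, 0.0, -0.1]
reg = regularized_advantages(adv, viol, beta=0.5)
```

In a policy-gradient update, the regularized values would take the place of the raw advantages, which leaves the reward function itself untouched.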
HC · Aug 14, 2023
Human-centered NLP Fact-checking: Co-Designing with Fact-checkers using Matchmaking for AI
Houjiang Liu, Anubrata Das, Alexander Boltz et al.
While many Natural Language Processing (NLP) techniques have been proposed for fact-checking, both academic research and fact-checking organizations report limited adoption of such NLP work due to poor alignment with fact-checker practices, values, and needs. To address this, we investigate a co-design method, Matchmaking for AI, to enable fact-checkers, designers, and NLP researchers to collaboratively identify which fact-checker needs should be addressed by technology, and to brainstorm ideas for potential solutions. Co-design sessions we conducted with 22 professional fact-checkers yielded a set of 11 design ideas that offer a "north star", integrating fact-checker criteria into novel NLP design concepts. These concepts include pre-bunking misinformation, efficient and personalized monitoring of misinformation, proactively reducing fact-checkers' potential biases, and collaboratively writing fact-check reports. Our work provides new insights into both human-centered fact-checking research and practice and AI co-design research.
AI · May 4
Making the Invisible Visible: Understanding the Mismatch Between Organizational Goals and Worker Experiences in AI Adoption
Christine P. Lee, Min Kyung Lee, Bilge Mutlu
While AI is often introduced into organizations to drive innovation and efficiency, many adoption efforts fail as workers resist and struggle to integrate these systems. These failures point to a deeper issue: workers, the very people expected to collaborate with AI, are often invisible in decisions about how AI is designed and used. Drawing on interviews with professionals who interact with AI systems daily in healthcare, finance, and management, we examine the disconnect between organizational expectations and worker experiences. We identify key barriers, including poor usability and interoperability, misaligned expectations, limited control, and insufficient communication. These challenges highlight a gap between how organizations implement AI and the evolving worker needs, tasks, and workflows that it fails to support. We argue that successful adoption requires recognizing workers as central to AI integration and propose adaptation strategies at the individual, task, and organizational levels to better align AI systems with real-world practices.
HC · Oct 16, 2025
State Your Intention to Steer Your Attention: An AI Assistant for Intentional Digital Living
Juheon Choi, Juyong Lee, Jian Kim et al.
When working on digital devices, people often face distractions that can lead to a decline in productivity and efficiency, as well as negative psychological and emotional impacts. To address this challenge, we introduce a novel Artificial Intelligence (AI) assistant that elicits a user's intention, assesses whether ongoing activities are in line with that intention, and provides gentle nudges when deviations occur. The system leverages a large language model to analyze screenshots, application titles, and URLs, issuing notifications when behavior diverges from the stated goal. Its detection accuracy is refined through initial clarification dialogues and continuous user feedback. In a three-week, within-subjects field deployment with 22 participants, we compared our assistant to both a rule-based intent reminder system and a passive baseline that only logged activity. Results indicate that our AI assistant effectively supports users in maintaining focus and aligning their digital behavior with their intentions. Our source code is publicly available at https://intentassistant.github.io
CL · Feb 10, 2025
GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing
Jinhao Duan, Xinyu Zhao, Zhuoxuan Zhang et al.
Although Large Language Models (LLMs) succeed in human-guided conversations such as instruction following and question answering, the potential of LLM-guided conversations, where LLMs direct the discourse and steer the conversation's objectives, remains under-explored. In this study, we first characterize LLM-guided conversation in terms of three fundamental components: (i) Goal Navigation; (ii) Context Management; and (iii) Empathetic Engagement, and propose GuideLLM as an instantiation. We then implement an interviewing environment for the evaluation of LLM-guided conversation. Specifically, this environment covers various topics for comprehensive interviewing evaluation, resulting in around 1.4k turns of utterances, 184k tokens, and over 200 events mentioned during the interviewing for each chatbot evaluation. We compare GuideLLM with 6 state-of-the-art LLMs, such as GPT-4o and Llama-3-70b-Instruct, in terms of interviewing quality and autobiography generation quality. For automatic evaluation, we derive user proxies from multiple autobiographies and employ LLM-as-a-judge to score LLM behaviors. We further conduct a human-involved experiment in which 45 human participants chat with GuideLLM and baselines. We then collect human feedback, preferences, and ratings regarding the qualities of conversation and autobiography. Experimental results indicate that GuideLLM significantly outperforms baseline LLMs in automatic evaluation and achieves consistently leading performance in human ratings.
HC · Feb 15, 2022
IF-City: Intelligible Fair City Planning to Measure, Explain and Mitigate Inequality
Yan Lyu, Hangxin Lu, Min Kyung Lee et al.
With the increasing pervasiveness of Artificial Intelligence (AI), many visual analytics tools have been proposed to examine fairness, but they mostly focus on data scientist users. Instead, tackling fairness must be inclusive and involve domain experts with specialized tools and workflows. Thus, domain-specific visualizations are needed for algorithmic fairness. Furthermore, while much work on AI fairness has focused on predictive decisions, less has been done for fair allocation and planning, which require human expertise and iterative design to integrate myriad constraints. We propose the Intelligible Fair Allocation (IF-Alloc) Framework that leverages explanations of causal attribution (Why), contrastive (Why Not) and counterfactual reasoning (What If, How To) to aid domain experts to assess and alleviate unfairness in allocation problems. We apply the framework to fair urban planning for designing cities that provide equal access to amenities and benefits for diverse resident types. Specifically, we propose an interactive visual tool, Intelligible Fair City Planner (IF-City), to help urban planners to perceive inequality across groups, identify and attribute sources of inequality, and mitigate inequality with automatic allocation simulations and constraint-satisfying recommendations. We demonstrate and evaluate the usage and usefulness of IF-City on a real neighborhood in New York City, US, with practicing urban planners from multiple countries, and discuss generalizing our findings, application, and framework to other use cases and applications of fair allocation.
HC · May 22, 2021
Human-AI Collaboration with Bandit Feedback
Ruijiang Gao, Maytal Saar-Tsechansky, Maria De-Arteaga et al.
Human-machine complementarity is important when neither the algorithm nor the human yields dominant performance across all instances in a given domain. Most research on algorithmic decision-making centers solely on the algorithm's performance, while recent work that explores human-machine collaboration has framed the decision-making problems as classification tasks. In this paper, we first propose and then develop a solution for a novel human-machine collaboration problem in a bandit feedback setting. Our solution aims to exploit human-machine complementarity to maximize decision rewards. We then extend our approach to settings with multiple human decision makers. We demonstrate the effectiveness of our proposed methods using both synthetic and real human responses, and find that our methods outperform both the algorithm and the human when each makes decisions alone. We also show how personalized routing in the presence of multiple human decision makers can further improve the human-machine team performance.
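In a bandit feedback setting, the reward is observed only for the action that was actually taken on each logged instance. The sketch below shows one standard way to evaluate a routing policy from such logs, inverse propensity scoring (IPS). The abstract does not say which estimator the paper uses, so IPS is swapped in here as an assumption. The action labels (`"human"`, `"algorithm"`), the log format, and the function names are hypothetical.

```python
from typing import Callable, List, Tuple

# One logged interaction: (context, action taken, observed reward,
# probability with which the logging policy chose that action).
LogEntry = Tuple[str, str, float, float]


def ips_value(
    logs: List[LogEntry],
    target_policy: Callable[[str, str], float],
) -> float:
    """Estimate the expected reward of target_policy from bandit logs.

    target_policy(context, action) returns the probability that the
    policy under evaluation would pick `action` in `context`.
    """
    if not logs:
        raise ValueError("logs must not be empty")
    total = 0.0
    for context, action, reward, propensity in logs:
        # Reweight each observed reward by how much more (or less) likely
        # the target policy is to take the logged action.
        total += target_policy(context, action) / propensity * reward
    return total / len(logs)


def route(context: str, action: str) -> float:
    """Hypothetical deterministic router: x1 goes to the human, x2 to the algorithm."""
    chosen = "human" if context == "x1" else "algorithm"
    return 1.0 if action == chosen else 0.0


# Toy logs collected by a uniformly random logging policy (propensity 0.5).
logs = [
    ("x1", "human", 1.0, 0.5),
    ("x1", "algorithm", 0.0, 0.5),
    ("x2", "algorithm", 1.0, 0.5),
    ("x2", "human", 0.0, 0.5),
]
estimate = ips_value(logs, route)
```

On these toy logs the router sends each context to whichever decision maker earned the reward, so its estimated value is higher than that of always choosing the human or always choosing the algorithm.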