AI Oct 17, 2023
Algorithmic Robustness
David Jensen, Brian LaMacchia, Ufuk Topcu et al.
Algorithmic robustness refers to the sustained performance of a computational system in the face of change in the nature of the environment in which that system operates or in the task that the system is meant to perform. Below, we motivate the importance of algorithmic robustness, present a conceptual framework, and highlight the areas of research to which algorithmic robustness is relevant. Why robustness? Robustness is an important enabler of other goals that are frequently cited in the context of public policy decisions about computational systems, including trustworthiness, accountability, fairness, and safety. Despite this dependence, it tends to be under-recognized compared to these other concepts. This is unfortunate, because robustness is often more immediately achievable than these other ultimate goals, which can be more subjective and exacting. Thus, we highlight robustness as an important goal for researchers, engineers, regulators, and policymakers when considering the design, implementation, and deployment of computational systems. We urge researchers and practitioners to elevate the attention paid to robustness when designing and evaluating computational systems. For many key systems, the immediate question after any demonstration of high performance should be: "How robust is that performance to realistic changes in the task or environment?" Greater robustness will set the stage for systems that are more trustworthy, accountable, fair, and safe. Toward that end, this document provides a brief roadmap to some of the concepts and existing research around the idea of algorithmic robustness.
LG Aug 6, 2024
MicroXercise: A Micro-Level Comparative and Explainable System for Remote Physical Therapy
Hanchen David Wang, Nibraas Khan, Anna Chen et al.
Recent global estimates suggest that as many as 2.41 billion individuals have health conditions that would benefit from rehabilitation services. Home-based Physical Therapy (PT) faces significant challenges in providing interactive feedback and meaningful observation for therapists and patients. To fill this gap, we present MicroXercise, which integrates micro-motion analysis with wearable sensors, providing therapists and patients with a comprehensive feedback interface, including video, text, and scores. Crucially, it employs multi-dimensional Dynamic Time Warping (DTW) and attribution-based explainable methods to analyze the existing deep learning neural networks used in monitoring exercises, focusing on a high granularity of exercise. This synergistic approach is pivotal, providing output matching the input size to precisely highlight critical subtleties and movements in PT, thus transforming complex AI analysis into clear, actionable feedback. By highlighting these micro-motions in different metrics, such as stability and range of motion, MicroXercise significantly enhances the understanding and relevance of feedback for end-users. Comparative performance metrics underscore its effectiveness over traditional methods, such as improvements of 39% and 42% in Feature Mutual Information (FMI) and Continuity, respectively. MicroXercise is a step ahead in home-based physical therapy, providing a technologically advanced and intuitively helpful solution to enhance patient care and outcomes.
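For readers unfamiliar with the multi-dimensional DTW mentioned in this abstract, the following is a minimal sketch of the textbook algorithm, not the MicroXercise implementation. It assumes each time step is a fixed-length vector of sensor readings and uses Euclidean distance as the per-step cost; the function name `dtw_distance` and the sample data are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension vectors.

    Assumption: Euclidean distance between aligned time steps; the
    abstract does not state which local cost the authors use.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical 2-D sensor traces: the second repeats one sample (slower motion).
reference = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
slower = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
print(dtw_distance(reference, slower))
```

Because DTW allows one sample to align with several samples in the other sequence, a repetition performed at a different speed can still match the reference closely, which is why it suits comparing exercise recordings.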
HC Apr 3, 2024
Toward Safe Evolution of Artificial Intelligence (AI) based Conversational Agents to Support Adolescent Mental and Sexual Health Knowledge Discovery
Jinkyung Park, Vivek Singh, Pamela Wisniewski
Following the recent release of various Artificial Intelligence (AI) based Conversational Agents (CAs), adolescents are increasingly using CAs for interactive knowledge discovery on sensitive topics, including mental and sexual health topics. Exploring such sensitive topics through online search has been an essential part of adolescent development, and CAs can support their knowledge discovery on such topics through human-like dialogues. Yet, unintended risks have been documented in adolescents' interactions with AI-based CAs, such as being exposed to inappropriate content or false information, or being given advice that is detrimental to their mental and physical well-being (e.g., to self-harm). In this position paper, we discuss the current landscape and opportunities for CAs to support adolescents' mental and sexual health knowledge discovery. We also discuss some of the challenges related to ensuring the safety of adolescents when interacting with CAs regarding sexual and mental health topics. We call for a discourse on how to set guardrails for the safe evolution of AI-based CAs for adolescents.
HC Apr 11, 2024
Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation
Jinkyung Park, Pamela Wisniewski, Vivek Singh
In this position paper, we discuss the potential for leveraging LLMs as interactive research tools to facilitate collaboration between human coders and AI to effectively annotate online risk data at scale. Collaborative human-AI labeling is a promising approach to annotating large-scale and complex data for various tasks. Yet, tools and methods to support effective human-AI collaboration for data annotation are under-studied. This gap is pertinent because co-labeling tasks need to support a two-way interactive discussion that can add nuance and context, particularly for online risk, which is highly subjective and contextualized. Therefore, we present some of the early benefits and challenges of using LLM-based tools for risk annotation and suggest future directions for the HCI research community to leverage LLMs as research tools to facilitate human-AI collaboration in contextualized online data annotation. Our research interests align very well with the purposes of the LLMs as Research Tools workshop to identify ongoing applications and challenges of using LLMs to work with data in HCI research. We anticipate gaining valuable insights from organizers and participants into how LLMs can help reshape the HCI community's methods for working with data.
CL Mar 16, 2025
Investigating Human-Aligned Large Language Model Uncertainty
Kyle Moore, Jesse Roberts, Daryl Watson et al.
Recent work has sought to quantify large language model uncertainty to facilitate model control and modulate user trust. Previous works focus on measures of uncertainty that are theoretically grounded or reflect the average overt behavior of the model. In this work, we investigate a variety of uncertainty measures in order to identify measures that correlate with human group-level uncertainty. We find that Bayesian measures and a variation on entropy measures, top-k entropy, tend to agree with human behavior as a function of model size. We find that some strong measures decrease in human-similarity with model size, but, by multiple linear regression, we find that combining multiple uncertainty measures provides comparable human-alignment with reduced size-dependency.
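The abstract names top-k entropy without defining it. One plausible reading, offered here only as an assumption and not as the authors' definition, is the Shannon entropy of the k most probable next-token probabilities after renormalizing them to sum to 1. The function name `top_k_entropy` and the example distribution are hypothetical.

```python
import math


def top_k_entropy(probs, k):
    """Shannon entropy (in nats) over the k largest probabilities.

    Assumption: the top-k mass is renormalized before computing entropy;
    the paper may define the measure differently.
    """
    top = sorted(probs, reverse=True)[:k]
    total = sum(top)
    if total <= 0:
        raise ValueError("top-k probabilities must have positive mass")
    # Skip zero entries, since p * log(p) tends to 0 as p tends to 0
    return -sum((p / total) * math.log(p / total) for p in top if p > 0)


# Hypothetical next-token distribution over a four-token vocabulary.
next_token_probs = [0.4, 0.4, 0.1, 0.1]
print(top_k_entropy(next_token_probs, k=2))
```

Restricting the calculation to the top k tokens ignores the long tail of very unlikely tokens, so the measure reflects how evenly the model splits its belief among its leading candidates.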
HC Jul 7, 2021
A Framework of High-Stakes Algorithmic Decision-Making for the Public Sector Developed through a Case Study of Child-Welfare
Devansh Saxena, Karla Badillo-Urquiola, Pamela Wisniewski et al.
Algorithms have permeated civil government and society, where they are being used to make high-stakes decisions about human lives. In this paper, we first develop a cohesive framework of algorithmic decision-making adapted for the public sector (ADMAPS) that reflects the complex socio-technical interactions between human discretion, bureaucratic processes, and algorithmic decision-making by synthesizing disparate bodies of work in the fields of Human-Computer Interaction (HCI), Science and Technology Studies (STS), and Public Administration (PA). We then apply the ADMAPS framework to conduct a qualitative analysis of an in-depth, eight-month ethnographic case study of the algorithms in daily use within a child-welfare agency that serves approximately 900 families and 1300 children in the Midwestern United States. Overall, we found there is a need to focus on strength-based algorithmic outcomes centered in social ecological frameworks. In addition, algorithmic systems need to support existing bureaucratic processes and augment human discretion, rather than replace it. Finally, collective buy-in in algorithmic systems requires trust in the target outcomes at both the practitioner and bureaucratic levels. As a result of our study, we propose guidelines for the design of high-stakes algorithmic decision-making tools in the child-welfare system, and more generally, in the public sector. We empirically validate the theoretically derived ADMAPS framework to demonstrate how it can be useful for systematically making pragmatic decisions about the design of algorithms for the public sector.