88.9 HC May 5
User Detection and Response Patterns of Sycophantic Behavior in Conversational AI
Kazi Noshin, Syed Ishtiaque Ahmed, Sharifa Sultana
Despite growing attention to LLM sycophancy from researchers and developers, users' own experiences of this behavior remain underexplored. We examine how everyday users experience AI sycophancy through Reddit discussions. Using our ODR Framework, which maps user experiences through observation, detection, and response stages, we find that users identify sycophantic behavior through methods like cross-platform comparison and consistency testing. They employ various mitigation strategies, including persona-based prompting and specific language engineering techniques. Our findings suggest that sycophancy does not have a uniformly negative effect; its impact differs by context. Users facing trauma, mental health struggles, or isolation often actively seek affirmative AI responses for emotional support. Users construct both technical and informal theories to explain sycophantic outputs. These findings suggest that eliminating sycophancy entirely may be misguided. We argue for context-aware AI design that balances the risks of affirmative interaction against its benefits, with implications for user education and system transparency.
IV Aug 3, 2023
Unmasking Parkinson's Disease with Smile: An AI-enabled Screening Framework
Tariq Adnan, Md Saiful Islam, Wasifur Rahman et al.
We present an efficient and accessible PD screening method by leveraging AI-driven models enabled by the largest video dataset of facial expressions from 1,059 unique participants. This dataset includes 256 individuals with PD (165 clinically diagnosed and 91 self-reported). Participants used webcams to record themselves mimicking three facial expressions (smile, disgust, and surprise) in diverse settings: their homes across multiple countries, a US clinic, and a PD wellness center in the US. Facial landmarks are automatically tracked from the recordings to extract features related to hypomimia, a prominent PD symptom characterized by reduced facial expressions. Machine learning algorithms are trained on these features to distinguish between individuals with and without PD. The model was tested for generalizability on external (unseen during training) test videos collected from a US clinic and Bangladesh. An ensemble of machine learning models trained on smile videos achieved an accuracy of 87.9 ± 0.1% (95% confidence interval) with an AUROC of 89.3 ± 0.3% as evaluated on held-out data (using k-fold cross-validation). In external test settings, the ensemble model achieved 79.8 ± 0.6% accuracy with 81.9 ± 0.3% AUROC on the clinical test set and 84.9 ± 0.4% accuracy with 81.2 ± 0.6% AUROC on participants from Bangladesh. In every setting, the model was free from detectable bias across sex and ethnic subgroups, except in the cohort from Bangladesh, where the model performed significantly better for female participants than for male participants. Smile videos can effectively differentiate between individuals with and without PD, offering a potentially easy, accessible, and cost-efficient way to screen for PD, especially when a clinical diagnosis is difficult to access.
75.4 HC Mar 22
The Illusion of Agreement with ChatGPT: Sycophancy and Beyond
Kazi Noshin, Sharifa Sultana
While concerns about ChatGPT-induced harms due to sycophancy and other behaviors, including gaslighting, have grown among researchers, how users themselves experience and mitigate these harms remains largely underexplored. We analyze Reddit discussions to investigate what concerns users report and how they address them. Our findings reveal five distinct user-reported concerns that manifest across multiple life domains, ranging from personal to societal: inducing delusion, digressing narratives, blaming users for models' limitations, inducing addiction, and providing unsupervised psychological support. We document three tiers of user-driven suggestions spanning functional usage techniques, behavioral approaches, and private and institutional safeguards. Our findings show that AI-induced harms require coordinated interventions across users, developers, and policymakers. We discuss design implications and future directions to mitigate the harms and ensure user benefits.