When Consistency Becomes Bias: Interviewer Effects in Semi-Structured Clinical Interviews
This work addresses a critical bias issue in mental health AI that is relevant to both researchers and clinicians: current models may not be learning genuine linguistic cues from patients. The contribution is incremental, as it builds on existing detection methods.
The study examines systematic bias from interviewer prompts in automatic depression detection from semi-structured clinical interviews. It finds that models trained on interviewer turns exploit fixed prompts to achieve high classification scores without using participant language, and that this performance inflation appears across three datasets.
Automatic depression detection from doctor-patient conversations has gained momentum thanks to the availability of public corpora and advances in language modeling. However, interpretability remains limited: strong performance is often reported without revealing what drives predictions. We analyze three datasets (ANDROIDS, DAIC-WOZ, and E-DAIC) and identify a systematic bias from interviewer prompts in semi-structured interviews. Models trained on interviewer turns exploit fixed prompts and their positions to distinguish depressed from control subjects, often achieving high classification scores without using participant language. Restricting models to participant utterances distributes decision evidence more broadly and reflects genuine linguistic cues. While semi-structured protocols ensure consistency, including interviewer prompts inflates performance because models can leverage script artifacts. Our results highlight a cross-dataset, architecture-agnostic bias and emphasize the need for analyses that localize decision evidence by time and speaker to ensure models learn from participants' language.
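As a rough illustration of what localizing evidence by speaker could look like, the sketch below splits transcripts into interviewer and participant turns and flags utterances that recur verbatim but only ever co-occur with one class. It is not the analysis used in the study: the toy transcripts are invented, the premise that some prompts follow label-dependent branches is an assumption made for the example, and the names `turns_by_speaker` and `label_pure_utterances` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data, invented for illustration only (not from any corpus).
# Each transcript has a label (1 = depressed, 0 = control) and (speaker, text) turns.
TRANSCRIPTS = [
    {"label": 1, "turns": [
        ("interviewer", "how have you been feeling lately"),
        ("participant", "tired most days"),
        ("interviewer", "do you still go to therapy"),
        ("participant", "yes every week"),
    ]},
    {"label": 1, "turns": [
        ("interviewer", "how have you been feeling lately"),
        ("participant", "not great honestly"),
        ("interviewer", "do you still go to therapy"),
        ("participant", "no i stopped"),
    ]},
    {"label": 0, "turns": [
        ("interviewer", "how have you been feeling lately"),
        ("participant", "pretty good"),
        ("interviewer", "what do you do to relax"),
        ("participant", "i go running"),
    ]},
    {"label": 0, "turns": [
        ("interviewer", "how have you been feeling lately"),
        ("participant", "tired most days"),
        ("interviewer", "what do you do to relax"),
        ("participant", "i read a lot"),
    ]},
]


def turns_by_speaker(transcript, speaker):
    """Return the utterance texts spoken by `speaker` in one transcript."""
    return [text for who, text in transcript["turns"] if who == speaker]


def label_pure_utterances(transcripts, speaker, min_transcripts=2):
    """Map each recurring utterance by `speaker` to its label, if it has only one.

    An utterance that recurs verbatim across transcripts yet only ever
    appears with a single class is a candidate script artifact.
    """
    labels_seen = defaultdict(set)  # utterance -> labels it co-occurs with
    support = defaultdict(int)      # utterance -> number of transcripts containing it
    for transcript in transcripts:
        for text in set(turns_by_speaker(transcript, speaker)):
            labels_seen[text].add(transcript["label"])
            support[text] += 1
    return {
        text: next(iter(labels))
        for text, labels in labels_seen.items()
        if len(labels) == 1 and support[text] >= min_transcripts
    }


interviewer_leaks = label_pure_utterances(TRANSCRIPTS, "interviewer")
participant_leaks = label_pure_utterances(TRANSCRIPTS, "participant")
print("interviewer:", interviewer_leaks)
print("participant:", participant_leaks)
```

In this toy setup the interviewer side yields two label-pure prompts while the participant side yields none, which mirrors the kind of shortcut a classifier could exploit. A real audit would also need to consider turn position and near-duplicate prompts, which this exact-match check ignores.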