InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts
This work addresses mental health professionals' need for automated, interpretable tools to identify risk factors in online narratives, though it is incremental as it builds on existing GPT-3 fine-tuning methods.
The paper tackles the detection of two interpersonal risk factors, Thwarted Belongingness and Perceived Burdensomeness, in Reddit posts for early screening of mental health disorders. It uses an interpretable prompting method with GPT-3, which improves classification and explanation generation over the baselines.
Mental health professionals and clinicians have observed an upsurge of mental disorders due to Interpersonal Risk Factors (IRFs). To simulate the human-in-the-loop triaging scenario for early detection of mental health disorders, we identify textual indications of two IRFs, Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu), within personal narratives. We apply N-shot learning with the GPT-3 model on the IRF dataset and underscore the importance of fine-tuning GPT-3 to incorporate the context-specific sensitivity and the interconnectedness of textual cues that represent both IRFs. In this paper, we introduce an Interpretable Prompting (InterPrompt) method to boost the attention mechanism by fine-tuning the GPT-3 model. This allows a more sophisticated level of language modification by adjusting the pre-trained weights. Our model learns to detect usual patterns and underlying connections across both IRFs, which leads to better system-level explainability and trustworthiness. Our results demonstrate that all four variants of the GPT-3 model, when fine-tuned with InterPrompt, perform considerably better than the baseline methods in both classification and explanation generation.
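To make the two settings mentioned above concrete, the following is a minimal sketch of how an N-shot prompt and prompt/completion fine-tuning records might be assembled for joint TBe/PBu labeling with an explanation. The template, the field names, the helper functions, and the two demonstration posts are assumptions invented for illustration; they are not taken from the IRF dataset and are not the InterPrompt template itself. No model API is called.

```python
import json

# Invented demonstration posts (NOT from the IRF dataset); labels and
# explanation spans are illustrative assumptions only.
DEMOS = [
    {"post": "I feel like nobody would notice if I left.",
     "tbe": True, "pbu": False, "explanation": "nobody would notice"},
    {"post": "My family keeps paying for my mistakes.",
     "tbe": False, "pbu": True, "explanation": "paying for my mistakes"},
]


def format_answer(tbe, pbu, explanation):
    """Hypothetical answer template: both labels plus the supporting cue."""
    return f" TBe={int(tbe)}; PBu={int(pbu)}; Explanation: {explanation}"


def format_prompt(post):
    """Hypothetical prompt template, ending where the model should continue."""
    return f"Post: {post}\nAnswer:"


def build_n_shot_prompt(demos, query_post):
    """Concatenate N solved demonstrations, then the unsolved query post."""
    solved = [
        format_prompt(d["post"])
        + format_answer(d["tbe"], d["pbu"], d["explanation"])
        for d in demos
    ]
    return "\n\n".join(solved + [format_prompt(query_post)])


def to_finetune_record(demo):
    """Serialize one example as a prompt/completion JSONL line."""
    return json.dumps({
        "prompt": format_prompt(demo["post"]),
        "completion": format_answer(
            demo["tbe"], demo["pbu"], demo["explanation"]
        ),
    })


if __name__ == "__main__":
    print(build_n_shot_prompt(DEMOS, "Everyone would be better off without me."))
    for demo in DEMOS:
        print(to_finetune_record(demo))
```

Keeping both labels and the explanation in a single completion is one way to expose the model to the interconnected cues of the two IRFs in a single pass; whether InterPrompt structures its outputs this way is not stated in this abstract.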