CL · Jul 15, 2025

DS@GT at eRisk 2025: From prompts to predictions, benchmarking early depression detection with conversational agent based assessments and temporal attention models

Anthony Miyaguchi, David Guecha, Yuwen Chiu, Sidharth Gaur

arXiv:2507.10958v1 · 2.71 citations · h-index: 4 · CLEF

Originality: Synthesis-oriented

AI Analysis

This work addresses depression screening through conversational AI, but it is incremental as it applies existing methods to a new benchmark without ground-truth labels.

The paper tackles early depression detection by using prompt-engineered LLMs to conduct BDI-II-based assessments from conversations. Its best submission placed second on the official leaderboard with DCHR = 0.50, ADODL = 0.89, and ASHR = 0.27.

This Working Note summarizes the participation of the DS@GT team in two eRisk 2025 challenges. For the Pilot Task on conversational depression detection with large language models (LLMs), we adopted a prompt-engineering strategy in which diverse LLMs conducted BDI-II-based assessments and produced structured JSON outputs. Because ground-truth labels were unavailable, we evaluated cross-model agreement and internal consistency. Our prompt design methodology aligned model outputs with BDI-II criteria and enabled analysis of the conversational cues that influenced symptom predictions. Our best submission, second on the official leaderboard, achieved DCHR = 0.50, ADODL = 0.89, and ASHR = 0.27.
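To make the idea of cross-model agreement concrete, the following is a minimal sketch of how agreement between structured JSON outputs could be measured when no ground truth exists. It is not the team's implementation: the JSON schema (a `scores` object), the model names, the three-item subset, and the exact-match agreement measure are all assumptions chosen for illustration. The only detail taken from the BDI-II itself is that each item is scored from 0 to 3.

```python
import json
from itertools import combinations

# Hypothetical JSON outputs from three models. The schema and the model
# names are illustrative assumptions; the abstract does not specify them.
raw_outputs = {
    "model_a": '{"scores": {"sadness": 2, "pessimism": 1, "loss_of_pleasure": 3}}',
    "model_b": '{"scores": {"sadness": 2, "pessimism": 2, "loss_of_pleasure": 3}}',
    "model_c": '{"scores": {"sadness": 1, "pessimism": 1, "loss_of_pleasure": 3}}',
}


def parse_scores(raw):
    """Parse one model's JSON output and check each item is an integer in 0..3."""
    scores = json.loads(raw)["scores"]
    for item, value in scores.items():
        if not isinstance(value, int) or not 0 <= value <= 3:
            raise ValueError(f"item {item!r} has invalid score {value!r}")
    return scores


def exact_agreement(a, b):
    """Fraction of shared items on which two models give the same score."""
    items = sorted(set(a) & set(b))
    if not items:
        raise ValueError("the two outputs share no items")
    return sum(a[i] == b[i] for i in items) / len(items)


parsed = {name: parse_scores(raw) for name, raw in raw_outputs.items()}

# Agreement for every unordered pair of models, then the mean over pairs.
pairwise = {
    (m1, m2): exact_agreement(parsed[m1], parsed[m2])
    for m1, m2 in combinations(sorted(parsed), 2)
}
mean_agreement = sum(pairwise.values()) / len(pairwise)

for pair, value in pairwise.items():
    print(pair, round(value, 3))
print("mean pairwise agreement:", round(mean_agreement, 3))
```

Exact-match agreement is the simplest choice here. A chance-corrected statistic or a distance on the ordinal 0 to 3 scale would be a reasonable alternative, depending on how much near-misses should count.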
