CL · Oct 9, 2025

Two-Stage Voting for Robust and Efficient Suicide Risk Detection on Social Media

Yukai Song, Pengfei Zhou, César Escobar-Viera, Candice Biernesser, Wei Huang, Jingtong Hu

arXiv:2510.08365v1 · 2.7 · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses suicide risk detection for at-risk individuals on social media, offering an incremental improvement by combining existing methods for better performance and efficiency.

The paper tackles the challenge of detecting implicit suicidal ideation on social media by proposing a two-stage voting architecture that balances efficiency and robustness. It reports 98.0% F1 on explicit cases and 99.7% on implicit ones while reducing LLM computational cost.

Suicide rates have risen worldwide in recent years, underscoring the urgent need for proactive prevention strategies. Social media provides valuable signals, as many at-risk individuals - who often avoid formal help due to stigma - choose instead to share their distress online. Yet detecting implicit suicidal ideation, conveyed indirectly through metaphor, sarcasm, or subtle emotional cues, remains highly challenging. Lightweight models like BERT handle explicit signals but fail on subtle implicit ones, while large language models (LLMs) capture nuance at prohibitive computational cost. To address this gap, we propose a two-stage voting architecture that balances efficiency and robustness. In Stage 1, a lightweight BERT classifier rapidly resolves high-confidence explicit cases. In Stage 2, ambiguous inputs are escalated to either (i) a multi-perspective LLM voting framework to maximize recall on implicit ideation, or (ii) a feature-based ML ensemble guided by psychologically grounded indicators extracted via prompt-engineered LLMs for efficiency and interpretability. To the best of our knowledge, this is among the first works to operationalize LLM-extracted psychological features as structured vectors for suicide risk detection. On two complementary datasets - explicit-dominant Reddit and implicit-only DeepSuiMind - our framework outperforms single-model baselines, achieving 98.0% F1 on explicit cases, 99.7% on implicit ones, and reducing the cross-domain gap below 2%, while significantly lowering LLM cost.
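The routing described in the abstract can be illustrated with a minimal sketch. It shows only the control flow: a cheap Stage 1 score resolves high-confidence inputs, and ambiguous inputs are escalated to a majority vote (option (i)). The function name `two_stage_predict`, the 0.9 confidence threshold, the tie-breaking rule, and the toy stand-in classifiers are all assumptions made for illustration; the abstract does not specify them, and a real system would plug in a BERT classifier and LLM-based voters.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical type aliases (not from the paper).
Stage1 = Callable[[str], float]  # returns P(at-risk) in [0, 1]
Voter = Callable[[str], int]     # returns 1 (at-risk) or 0 (not at-risk)


def two_stage_predict(
    text: str,
    stage1: Stage1,
    voters: Sequence[Voter],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> Tuple[int, str]:
    """Return (label, stage) where stage records which stage decided."""
    p = stage1(text)
    # Confidence is the probability of the more likely class.
    confidence = max(p, 1.0 - p)
    if confidence >= threshold:
        return (1 if p >= 0.5 else 0), "stage1"

    # Stage 2: escalate ambiguous inputs to a majority vote.
    votes = Counter(voter(text) for voter in voters)
    # Assumption: ties resolve to at-risk, favoring recall.
    label = 1 if votes[1] >= votes[0] else 0
    return label, "stage2"


# Toy stand-ins so the sketch runs without any model.
toy_scores = {"explicit post": 0.97, "neutral post": 0.02, "ambiguous post": 0.55}


def toy_stage1(text: str) -> float:
    return toy_scores[text]


toy_voters = [lambda t: 1, lambda t: 1, lambda t: 0]

explicit_result = two_stage_predict("explicit post", toy_stage1, toy_voters)
neutral_result = two_stage_predict("neutral post", toy_stage1, toy_voters)
ambiguous_result = two_stage_predict("ambiguous post", toy_stage1, toy_voters)
```

In this sketch only the ambiguous input reaches the voters, which is how a design of this kind limits LLM calls: the expensive stage runs only when the lightweight classifier is unsure.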
