CL AI SD | Aug 28, 2025

Exploring Machine Learning and Language Models for Multimodal Depression Detection

Javier Si Zhao Hong, Timothy Zoe Delaya, Sherwyn Chan Yin Kit, Pai Chet Ng, Xiaoxiao Miao

arXiv:2508.20805v1 | 2.72 citations | h-index: 1 | APSIPA

Originality: Synthesis-oriented

AI Analysis

This work addresses depression detection for mental health applications, but it is incremental as it applies existing methods to a new challenge dataset.

The paper tackles multimodal depression detection by comparing XGBoost, transformers, and LLMs on audio, video, and text features, highlighting each model type's strengths and limitations in capturing depression signals.

This paper presents our approach to the first Multimodal Personality-Aware Depression Detection Challenge, focusing on multimodal depression detection using machine learning and deep learning models. We explore and compare the performance of XGBoost, transformer-based architectures, and large language models (LLMs) on audio, video, and text features. Our results highlight the strengths and limitations of each type of model in capturing depression-related signals across modalities, offering insights into effective multimodal representation strategies for mental health prediction.
