CV | AI | MM | May 9, 2024

LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild

arXiv:2407.00024v1 | 27 citations | Has Code | Inf Fusion
Originality: Incremental advance
AI Analysis

This work addresses data scarcity for depression detection in affective computing, though it is incremental as it builds on existing multimodal approaches.

The authors tackled the problem of scarce data for depression detection by building a large-scale multimodal vlog dataset (LMVD) with 1823 samples from 1475 participants, and proposed a novel MDDformer architecture that demonstrated superior performance for depression detection.

Depression can significantly impact many aspects of an individual's life, including personal and social functioning, academic and work performance, and overall quality of life. Many researchers in affective computing are adopting deep learning to explore patterns relevant to depression detection. However, because of concerns about protecting subjects' privacy, data in this area remains scarce, which poses a challenge for the deep discriminative models used to detect depression. To address this, a large-scale multimodal vlog dataset (LMVD) for depression recognition in the wild is built. LMVD contains 1823 samples, totaling 214 hours, from 1475 participants, captured from four multimedia platforms (Sina Weibo, Bilibili, TikTok, and YouTube). A novel architecture termed MDDformer is proposed to learn the non-verbal behaviors of individuals. Extensive validations are performed on the LMVD dataset, demonstrating superior performance for depression detection. We anticipate that LMVD will be a valuable resource for the depression detection community. The data and code will be released at: https://github.com/helang818/LMVD/.

Code Implementations: 2 repos