CL
Apr 20, 2023

Domain-specific Continued Pretraining of Language Models for Capturing Long Context in Mental Health

Shaoxiong Ji, Tianlin Zhang, Kailai Yang, Sophia Ananiadou, Erik Cambria, Jörg Tiedemann

arXiv:2304.10447v17.842 citations
h-index: 113

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of handling long social media posts for mental health detection. Its contribution is incremental, since it adapts existing models to a specific domain.

The paper addresses the lack of domain-specific pretrained models for long-sequence modeling in mental health by continuing the pretraining of XLNet and Longformer. The resulting models, MentalXLNet and MentalLongformer, are released and evaluated on mental health classification and long-range ability.

Pretrained language models have been used in various natural language processing applications. In the mental health domain, domain-specific language models are pretrained and released, which facilitates the early detection of mental health conditions. Social posts, e.g., on Reddit, are usually long documents. However, there are no domain-specific pretrained models for long-sequence modeling in the mental health domain. This paper conducts domain-specific continued pretraining to capture the long context for mental health. Specifically, we train and release MentalXLNet and MentalLongformer based on XLNet and Longformer. We evaluate the mental health classification performance and the long-range ability of these two domain-specific pretrained models. Our models are released in HuggingFace.
