Different Affordances on Facebook and SMS Text Messaging Do Not Impede Generalization of Language-Based Predictive Models
This addresses the challenge of deploying mobile health interventions by reducing the need for expensive SMS data collection, though it is incremental as it validates existing methods across similar domains.
The study investigated whether language models trained on Facebook data can generalize to SMS text messages for predicting user attributes like age, gender, and mental health, finding no significant performance differences in 6 out of 8 models, enabling more accurate just-in-time health interventions.
Adaptive mobile device-based health interventions often use machine learning models trained on non-mobile device data, such as social media text, due to the difficulty and high expense of collecting large text message (SMS) data. Therefore, understanding the differences and generalization of models between these platforms is crucial for proper deployment. We examined the psycho-linguistic differences between Facebook and text messages, and their impact on out-of-domain model performance, using a sample of 120 users who shared both. We found that users use Facebook for sharing experiences (e.g., leisure) and SMS for task-oriented and conversational purposes (e.g., plan confirmations), reflecting the differences in the affordances. To examine the downstream effects of these differences, we used pre-trained Facebook-based language models to estimate age, gender, depression, life satisfaction, and stress on both Facebook and SMS. We found no significant differences in correlations between the estimates and self-reports across 6 of 8 models. These results suggest using pre-trained Facebook language models to achieve better accuracy with just-in-time interventions.