CL · LG · SI · May 16, 2017

Social Media-based Substance Use Prediction

arXiv:1705.05633v2
Originality: Synthesis-oriented
AI Analysis

This work addresses substance use prediction for public health applications. It is incremental: it applies known techniques to a specific domain and reports improved performance.

The paper tackles the problem of predicting substance use from social media data by combining unsupervised feature learning with multi-view integration. Its best models reach AUC scores of 86% for tobacco, 81% for alcohol, and 84% for drug use, which the authors report as significantly outperforming existing methods.

In this paper, we demonstrate how the state-of-the-art machine learning and text mining techniques can be used to build effective social media-based substance use detection systems. Since a substance use ground truth is difficult to obtain on a large scale, to maximize system performance, we explore different feature learning methods to take advantage of a large amount of unsupervised social media data. We also demonstrate the benefit of using multi-view unsupervised feature learning to combine heterogeneous user information such as Facebook "likes" and "status updates" to enhance system performance. Based on our evaluation, our best models achieved 86% AUC for predicting tobacco use, 81% for alcohol use, and 84% for drug use, all of which significantly outperformed existing methods. Our investigation has also uncovered interesting relations between a user's social media behavior (e.g., word usage) and substance use.
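The results above are reported as AUC (area under the ROC curve). As a reading aid, here is a minimal sketch of what that metric measures: the probability that a randomly chosen positive example (a substance user) receives a higher score than a randomly chosen negative one. The function name `auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half).

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up data (hypothetical; not from the paper).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
toy_auc = auc(labels, scores)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a reported 86% means the model ranks a user above a non-user in roughly 86 of 100 random pairs. The quadratic pairwise loop is chosen for clarity; rank-based implementations are the usual choice for large datasets.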

