CLCYSIMar 28, 2016

Longitudinal Analysis of Discussion Topics in an Online Breast Cancer Community using Convolutional Neural Networks

arXiv:1603.08458v360 citations
Originality Synthesis-oriented
AI Analysis

This work addresses the challenge of topic analysis in online health communities for researchers and practitioners, but it is incremental as it applies existing CNN methods to a new domain-specific dataset.

The authors tackled the problem of identifying discussion topics in online health communities by developing a multi-class schema, annotated dataset, and supervised classifiers, including a convolutional neural network (CNN), and applied it to a breast cancer community for longitudinal analysis. Their results show that CNN outperforms other classifiers in topic classification and detects certain topic change trajectories.

Identifying topics of discussions in online health communities (OHC) is critical to various applications, but can be difficult because topics of OHC content are usually heterogeneous and domain-dependent. In this paper, we provide a multi-class schema, an annotated dataset, and supervised classifiers based on convolutional neural network (CNN) and other models for the task of classifying discussion topics. We apply the CNN classifier to the most popular breast cancer online community, and carry out a longitudinal analysis to show topic distributions and topic changes throughout members' participation. Our experimental results suggest that CNN outperforms other classifiers in the task of topic classification, and that certain trajectories can be detected with respect to topic changes.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes