CL · Sep 15, 2016

Characterizing the Language of Online Communities and its Relation to Community Reception

arXiv:1609.04779v1 · 51 citations
AI Analysis

This work addresses the problem of understanding community dynamics on online platforms such as Reddit. The contribution is incremental: it builds on existing language modeling techniques.

The study examines how language style and topic in online communities relate to community identity and reception. It finds that style is a better indicator of identity than topic, and that a contribution's style similarity to a community correlates positively with its reception, whereas topic similarity does not.

This work investigates style and topic aspects of language in online communities: looking at both utility as an identifier of the community and correlation with community reception of content. Style is characterized using a hybrid word and part-of-speech tag n-gram language model, while topic is represented using Latent Dirichlet Allocation. Experiments with several Reddit forums show that style is a better indicator of community identity than topic, even for communities organized around specific topics. Further, there is a positive correlation between the community reception to a contribution and the style similarity to that community, but not so for topic similarity.
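
The hybrid word and part-of-speech idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that only a small set of frequent words is kept as-is and every other word is replaced by its POS tag, that posts arrive already tagged as (word, tag) pairs, and that a bigram model with add-one smoothing stands in for whatever n-gram order and smoothing the paper uses. The names `hybridize`, `BigramModel`, and the toy data are hypothetical. Lower cross-entropy under a community's model is read here as higher style similarity to that community.

```python
import math
from collections import Counter


def hybridize(tagged_sentence, keep_words):
    """Keep words in keep_words; replace every other word with its POS tag."""
    return [w.lower() if w.lower() in keep_words else tag
            for w, tag in tagged_sentence]


class BigramModel:
    """Bigram language model with add-one smoothing (an illustrative choice)."""

    def __init__(self, sentences):
        self.contexts = Counter()
        self.bigrams = Counter()
        self.vocab = {"</s>"}
        for sent in sentences:
            tokens = ["<s>"] + sent + ["</s>"]
            self.contexts.update(tokens[:-1])
            self.bigrams.update(zip(tokens, tokens[1:]))
            self.vocab.update(sent)

    def cross_entropy(self, sent):
        """Average negative log2 probability per predicted token."""
        tokens = ["<s>"] + sent + ["</s>"]
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            p = (self.bigrams[(prev, cur)] + 1) / (self.contexts[prev] + v)
            total -= math.log2(p)
        return total / (len(tokens) - 1)


# Toy, hand-tagged data (hypothetical; a real pipeline would use a POS tagger).
keep_words = {"i", "think", "the", "lol", "that", "is"}
community_a = [
    [("I", "PRP"), ("think", "VBP"), ("the", "DT"), ("model", "NN"), ("works", "VBZ")],
    [("I", "PRP"), ("think", "VBP"), ("the", "DT"), ("proof", "NN"), ("holds", "VBZ")],
]
community_b = [
    [("lol", "UH"), ("that", "DT"), ("cat", "NN"), ("is", "VBZ"), ("cute", "JJ")],
    [("lol", "UH"), ("that", "DT"), ("dog", "NN"), ("is", "VBZ"), ("cute", "JJ")],
]

model_a = BigramModel([hybridize(s, keep_words) for s in community_a])
model_b = BigramModel([hybridize(s, keep_words) for s in community_b])

# A new post on a different topic word but with community A's phrasing.
post = hybridize(
    [("I", "PRP"), ("think", "VBP"), ("the", "DT"), ("theorem", "NN"), ("fails", "VBZ")],
    keep_words,
)
score_a = model_a.cross_entropy(post)
score_b = model_b.cross_entropy(post)
```

Because content words such as "theorem" collapse to their tags, the post matches community A's patterns even though its nouns and verbs never appeared in the training data, which is the sense in which such a model captures style rather than topic.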

