What the Language You Tweet Says About Your Occupation
This work addresses the problem of inferring occupations from social media data, which could be useful for targeted advertising or social analysis, but it is incremental as it applies existing methods to a new dataset.
The paper investigated how language in tweets correlates with occupations, finding that people in different jobs exhibit unique linguistic styles and personality traits, and built a classifier that achieved high accuracy for job prediction.
Many aspects of people's lives are proven to be deeply connected to their jobs. In this paper, we first investigate the distinct characteristics of major occupation categories based on tweets. From multiple social media platforms, we gather several types of user information. From users' LinkedIn webpages, we learn their proficiencies. To overcome the ambiguity of self-reported information, a soft clustering approach is applied to extract occupations from crowd-sourced data. Eight job categories are extracted, including Marketing, Administrator, Start-up, Editor, Software Engineer, Public Relation, Office Clerk, and Designer. Meanwhile, users' posts on Twitter provide cues for understanding their linguistic styles, interests, and personalities. Our results suggest that people of different jobs have unique tendencies in certain language styles and interests. Our results also clearly reveal distinctive levels in terms of Big Five Traits for different jobs. Finally, a classifier is built to predict job types based on the features extracted from tweets. A high accuracy indicates a strong discrimination power of language features for job prediction task.