CL, Mar 17, 2022

Multilingual Detection of Personal Employment Status on Twitter

MIT, Oxford
arXiv:2203.09178v1640 citations, h-index: 23
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of identifying rare personal disclosures for applications like job matching and social protection, but it is incremental as it applies existing methods to a new domain.

The study addresses the detection of personal employment status disclosures on Twitter in multilingual settings with extreme class imbalance. Using Active Learning (AL) strategies with BERT-based models, it finds that a small number of AL iterations yields large and significant gains in precision, recall, and diversity over a supervised baseline trained with the same number of labels.

Detecting disclosures of individuals' employment status on social media can provide valuable information to match job seekers with suitable vacancies, offer social protection, or measure labor market flows. However, identifying such personal disclosures is a challenging task due to their rarity in a sea of social media content and the variety of linguistic forms used to describe them. Here, we examine three Active Learning (AL) strategies in real-world settings of extreme class imbalance, and identify five types of disclosures about individuals' employment status (e.g. job loss) in three languages using BERT-based classification models. Our findings show that, even under extreme imbalance settings, a small number of AL iterations is sufficient to obtain large and significant gains in precision, recall, and diversity of results compared to a supervised baseline with the same number of labels. We also find that no AL strategy consistently outperforms the rest. Qualitative analysis suggests that AL helps focus the attention mechanism of BERT on core terms and adjust the boundaries of semantic expansion, highlighting the importance of interpretable models to provide greater control and visibility into this dynamic learning process.
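The active learning loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the three AL strategies, so the sketch uses uncertainty sampling (one common choice) against a random-sampling baseline with the same label budget; a tiny logistic regression on synthetic one-dimensional data stands in for the BERT-based classifier; and the names `make_pool`, `seed_sample`, `fit_logreg`, and `run_loop` are hypothetical.

```python
import math
import random


def make_pool(n=2000, pos_rate=0.02, seed=0):
    """Synthetic pool with rare positives (assumed toy data, not tweets)."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n):
        y = 1 if rng.random() < pos_rate else 0
        x = rng.gauss(2.0 if y else 0.0, 1.0)  # positives cluster near x = 2
        pool.append((x, y))
    return pool


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg(xs, ys, epochs=200, lr=0.1):
    """Logistic regression by batch gradient descent (stand-in for BERT)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def seed_sample(pool, n_pos=5, n_neg=15):
    """Initial labeled set: the first few positives and negatives in the pool."""
    pos = [i for i, (_, y) in enumerate(pool) if y == 1][:n_pos]
    neg = [i for i, (_, y) in enumerate(pool) if y == 0][:n_neg]
    return pos + neg


def run_loop(pool, seed_idx, iterations=5, batch=20, strategy="uncertainty", seed=0):
    """Grow the labeled set by `batch` examples per iteration."""
    rng = random.Random(seed)
    labeled = list(seed_idx)
    unlabeled = set(range(len(pool))) - set(labeled)
    for _ in range(iterations):
        candidates = sorted(unlabeled)
        if strategy == "uncertainty":
            # Retrain on current labels, then pick the examples whose
            # predicted probability is closest to 0.5.
            w, b = fit_logreg([pool[i][0] for i in labeled],
                              [pool[i][1] for i in labeled])
            candidates.sort(key=lambda i: abs(sigmoid(w * pool[i][0] + b) - 0.5))
            picked = candidates[:batch]
        else:
            # Baseline: spend the same label budget on random examples.
            picked = rng.sample(candidates, batch)
        labeled.extend(picked)  # "annotation" here just reveals the stored label
        unlabeled.difference_update(picked)
    return labeled


pool = make_pool()
seed_idx = seed_sample(pool)
for strategy in ("uncertainty", "random"):
    labeled = run_loop(pool, seed_idx, strategy=strategy)
    positives = sum(pool[i][1] for i in labeled)
    print(f"{strategy}: {len(labeled)} labels, {positives} positives")
```

Both runs end with the same number of labels, which mirrors the paper's comparison against a baseline "with the same number of labels". The counts printed here come from synthetic data and say nothing about the paper's reported results.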

Code Implementations: 1 repo