Towards Detection of Subjective Bias using Contextualized Word Embeddings
This work addresses subjective bias detection for applications like propaganda detection and content recommendation, but its contribution is incremental: it builds on existing BERT models with ensemble techniques.
The paper tackles the problem of detecting subjective bias in natural language, such as inflammatory words or presupposed truths, using BERT-based models on the Wiki Neutrality Corpus, which contains 360k labeled instances. Its BERT-based ensembles outperform state-of-the-art methods like BERT-large by a margin of 5.6 in F1 score.
Subjective bias detection is critical for applications like propaganda detection, content recommendation, sentiment analysis, and bias neutralization. This bias is introduced in natural language via inflammatory words and phrases, casting doubt over facts, and presupposing the truth. In this work, we perform comprehensive experiments for detecting subjective bias using BERT-based models on the Wiki Neutrality Corpus (WNC). The dataset consists of $360k$ labeled instances, drawn from Wikipedia edits that remove various instances of the bias. We further propose BERT-based ensembles that outperform state-of-the-art methods like $BERT_{large}$ by a margin of $5.6$ in F1 score.
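As a rough illustration of how an ensemble of binary bias classifiers might be combined and scored, the sketch below averages per-model probabilities (soft voting) and computes F1 on the resulting labels. The abstract does not say how the ensembles are built, so soft voting is an assumption here, and the names `soft_vote` and `f1_score` as well as the probabilities are hypothetical toy values, not results from the paper.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average each model's probability of the 'biased' class, then threshold.

    prob_sets[m][i] is model m's probability that sentence i is biased.
    Soft voting is an assumed combination rule, not the paper's confirmed method.
    """
    n_models = len(prob_sets)
    n_items = len(prob_sets[0])
    preds = []
    for i in range(n_items):
        mean_p = sum(model[i] for model in prob_sets) / n_models
        preds.append(1 if mean_p >= threshold else 0)
    return preds


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive ('biased') class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy probabilities from three hypothetical models on four sentences.
probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.4, 0.3, 0.7],
    [0.7, 0.1, 0.5, 0.6],
]
labels = [1, 0, 1, 1]  # 1 = subjectively biased, 0 = neutral

ensemble_preds = soft_vote(probs)
ensemble_f1 = f1_score(labels, ensemble_preds)
```

In practice the probabilities would come from fine-tuned classifiers rather than hand-written lists, and other combination rules (majority voting, stacking) are equally plausible readings of "ensemble".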