A Novel BERT-based Classifier to Detect Political Leaning of YouTube Videos based on their Titles
This work addresses the need for automated political content analysis on YouTube, where a quarter of US adults get their news. The contribution is incremental, however, as it applies an existing method (BERT) to a new domain.
The authors tackle the problem of identifying political leaning from YouTube video titles, proposing a BERT-based classifier that achieves 75% accuracy and a 77% F1 score, trained and validated on a dataset of 10 million titles.
A quarter of US adults regularly get their news from YouTube. Yet, despite the massive amount of political content available on the platform, no classifier has been proposed to date to identify the political leaning of YouTube videos. To fill this gap, we propose a novel classifier based on BERT, a language model from Google, that classifies YouTube videos based solely on their titles into six categories: Far Left, Left, Center, Anti-Woke, Right, and Far Right. We use a public dataset of 10 million YouTube video titles (spanning various categories) to train and validate the proposed classifier. We compare the classifier against several alternatives trained on the same dataset and find that ours achieves the highest accuracy (75%) and the highest F1 score (77%). To further validate classification performance, we collect videos from the YouTube channels of numerous prominent news agencies with widely known political leanings, such as Fox News and the New York Times, and apply our classifier to their video titles. In the vast majority of cases, the predicted political leaning matches that of the news agency.
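To make the two reported metrics concrete for a six-class problem, the following is a minimal sketch of how accuracy and an F1 score can be computed over the label set named above. The abstract does not state which F1 averaging scheme was used, so the macro average here is an assumption. The function names (`accuracy`, `per_class_f1`, `macro_f1`) and the toy labels in the usage note are hypothetical and do not come from the paper or its dataset.

```python
from collections import Counter

# The six categories named in the abstract.
LABELS = ["Far Left", "Left", "Center", "Anti-Woke", "Right", "Far Right"]


def accuracy(y_true, y_pred):
    """Fraction of titles whose predicted label equals the reference label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, labels=LABELS):
    """Return {label: F1}. A class with no TP, FP, or FN gets F1 = 0.0."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic
        # mean of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)
```

As a usage example, calling `accuracy(y_true, y_pred)` and `macro_f1(y_true, y_pred)` on two equal-length lists of label strings returns values in [0, 1]. Because macro averaging weights all six classes equally, rare classes such as Far Left or Far Right influence the F1 score as much as common ones, which is one reason accuracy and F1 can differ on an imbalanced dataset.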