Text Classification of Manifestos and COVID-19 Press Briefings using BERT and Convolutional Neural Networks
This work addresses political discourse analysis for researchers, but it is incremental as it applies existing methods to new data.
The authors tackled the problem of sentence-level political discourse classification by training a CNN classifier on annotated political manifestos and applying it to COVID-19 press briefings, showing that CNN combined with BERT outperforms other embeddings and enables automatic classification without additional training.
We build a sentence-level political discourse classifier using existing human expert annotated corpora of political manifestos from the Manifestos Project (Volkens et al., 2020a) and apply it to a corpus of COVID-19 Press Briefings (Chatsiou, 2020). We use the manually annotated political manifestos as training data for a local topic Convolutional Neural Network (CNN) classifier, then apply the trained classifier to the COVID-19 Press Briefings Corpus to automatically classify the sentences in that test corpus. We report on a series of experiments with CNNs trained on top of pre-trained embeddings for sentence-level classification tasks. We show that a CNN combined with transformers like BERT outperforms a CNN combined with other embeddings (Word2Vec, GloVe, ELMo) and that it is possible to use a pre-trained classifier to conduct automatic classification on different political texts without additional training.
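To make the idea of a CNN trained on top of pre-trained embeddings more concrete, the following is a minimal sketch of the forward pass of such a sentence classifier, not the authors' implementation. The abstract does not specify the architecture, so the kernel widths, the ReLU activation, the max-over-time pooling, and the single linear output layer are all assumptions, and the function names (`conv_max_over_time`, `classify_sentence`) are hypothetical. The token vectors are random placeholders standing in for BERT (or Word2Vec, GloVe, ELMo) embeddings, and the weights are untrained, so the predicted class is meaningless; only the data flow is illustrated.

```python
import math
import random


def relu(x):
    return x if x > 0.0 else 0.0


def conv_max_over_time(tokens, kernel, bias):
    """Slide one kernel (width x dim) over the token embeddings, apply ReLU,
    and keep the largest activation (max-over-time pooling)."""
    width = len(kernel)
    best = 0.0  # ReLU outputs are never negative; also the result if the sentence is too short
    for start in range(len(tokens) - width + 1):
        total = bias
        for offset in range(width):
            total += sum(w * x for w, x in zip(kernel[offset], tokens[start + offset]))
        best = max(best, relu(total))
    return best


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def classify_sentence(tokens, kernels, biases, out_weights, out_bias):
    """Return class probabilities for one sentence given as a list of token vectors."""
    # One pooled feature per convolution kernel.
    features = [conv_max_over_time(tokens, k, b) for k, b in zip(kernels, biases)]
    # Linear output layer followed by softmax.
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(out_weights, out_bias)]
    return softmax(logits)


# Placeholder data: random vectors stand in for pre-trained embeddings (assumption).
rng = random.Random(0)
dim, seq_len, n_classes = 4, 6, 3
sentence = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]
# Two untrained kernels spanning 2 and 3 consecutive tokens (assumed widths).
kernels = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for width in (2, 3)]
biases = [0.0 for _ in kernels]
out_weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in range(n_classes)]
out_bias = [0.0] * n_classes

probs = classify_sentence(sentence, kernels, biases, out_weights, out_bias)
predicted = max(range(n_classes), key=lambda i: probs[i])
```

In the setting the abstract describes, the weights would be fitted on the annotated manifesto sentences and then kept fixed while `classify_sentence` is applied to each sentence of the press briefings corpus, which is what classification "without additional training" amounts to in this sketch.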