Domain Adaptation of Transformer-Based Models using Unlabeled Data for Relevance and Polarity Classification of German Customer Feedback
This work addresses the need for companies to efficiently process German customer feedback, though it is incremental as it adapts existing transformer methods to a specific domain.
The study tackled the problem of analyzing German customer feedback by applying transformer-based models to relevance and polarity classification, achieving micro-averaged F1-scores up to 96.1% for relevance and 85.3% for polarity, outperforming baselines and previous models.
Understanding customer feedback is becoming a necessity for companies to identify problems and improve their products and services. Text classification and sentiment analysis can play a major role in analyzing this data by using a variety of machine and deep learning approaches. In this work, different transformer-based models are utilized to explore how efficient these models are when working with a German customer feedback dataset. In addition, these pre-trained models are further analyzed to determine if adapting them to a specific domain using unlabeled data can yield better results than off-the-shelf pre-trained models. To evaluate the models, two downstream tasks from the GermEval 2017 are considered. The experimental results show that transformer-based models can reach significant improvements compared to a fastText baseline and outperform the published scores and previous models. For the subtask Relevance Classification, the best models achieve a micro-averaged $F1$-Score of 96.1 % on the first test set and 95.9 % on the second one, and a score of 85.1 % and 85.3 % for the subtask Polarity Classification.