Palomino-Ochoa at SemEval-2020 Task 9: Robust System based on Transformer for Code-Mixed Sentiment Classification
This work presents a robust system for sentiment analysis of code-mixed Spanish-English social media text, a common challenge for natural language processing in multilingual communities.
This paper addresses the task of sentiment classification of Spanish-English code-mixed tweets. The system, dplominop, achieved a weighted-F1 score of 0.755, ranking 4th out of 29 systems on the SemEval 2020 Task 9 Sentimix Spanglish test set.
We present a transfer learning system for sentiment classification of code-mixed Spanish-English text. Our proposal takes the state-of-the-art language model BERT and embeds it within a ULMFiT transfer learning pipeline. This combination allows us to predict the polarity of code-mixed (English-Spanish) tweets. Among 29 submitted systems, our approach (referred to as dplominop) ranked 4th on the Sentimix Spanglish test set of SemEval 2020 Task 9. Our system yields a weighted-F1 score of 0.755. The result can be easily reproduced, as the source code and implementation details are made available.
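For readers unfamiliar with the reported metric, the following is a minimal sketch of how a weighted-F1 score is commonly computed: the F1 score of each class is averaged with weights proportional to the number of true instances of that class. The function name `weighted_f1` and the label strings in the usage note are illustrative assumptions; this is not the official task scorer or our released implementation.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of gold instances per class
    total = len(y_true)
    score = 0.0

    for label, n_true in support.items():
        # True positives: predicted `label` and the gold label is `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)

        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        # Each class contributes in proportion to its share of gold labels.
        score += (n_true / total) * f1

    return score
```

As a usage example, `weighted_f1(["positive", "positive", "negative", "neutral"], ["positive", "negative", "negative", "neutral"])` weights the three per-class F1 scores by supports of 2, 1, and 1. Classes that appear only among the predictions have zero support and therefore do not contribute to the average.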