Deep Learning Brasil -- NLP at SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets
This work addresses sentiment analysis for code-mixed social media data, but its contribution is incremental, since it applies existing ensemble methods to a specific dataset.
The paper tackled sentiment analysis of Hindi-English code-mixed tweets by developing an ensemble of four models (MultiFiT, BERT, ALBERT, and XLNET), achieving an F1 score of 72.7% in the SemEval-2020 Task 9 challenge.
In this paper, we describe a methodology to predict sentiment in code-mixed (Hindi-English) tweets. Our team, registered as verissimo.manoel on CodaLab, developed an approach based on an ensemble of four models (MultiFiT, BERT, ALBERT, and XLNET). The final classifier is an ensemble over the softmax outputs of these four models. This architecture was used and evaluated in the SemEval-2020 Task 9 challenge, where our system achieved an F1 score of 72.7%.
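As a rough illustration of ensembling over softmax outputs, the sketch below averages per-class probabilities from the four models and picks the highest-scoring class. The abstract does not state the exact combination rule, so the unweighted mean here is an assumption, as are the label order in `LABELS`, the helper names `average_softmax` and `predict_labels`, and the toy probability values, which are made up for the example and are not results from the paper.

```python
from typing import Dict, List, Sequence

# Assumed label order; the abstract does not list the classes.
LABELS = ("negative", "neutral", "positive")


def average_softmax(
    per_model_probs: Dict[str, Sequence[Sequence[float]]]
) -> List[List[float]]:
    """Average class probabilities across models, tweet by tweet.

    Assumption: every model contributes equally (unweighted mean).
    """
    models = list(per_model_probs.values())
    n_models = len(models)
    n_tweets = len(models[0])
    averaged = []
    for i in range(n_tweets):
        rows = [model[i] for model in models]  # one softmax row per model
        averaged.append([sum(col) / n_models for col in zip(*rows)])
    return averaged


def predict_labels(avg_probs: Sequence[Sequence[float]]) -> List[str]:
    """Pick the class with the highest averaged probability."""
    return [
        LABELS[max(range(len(row)), key=row.__getitem__)] for row in avg_probs
    ]


# Toy softmax outputs for two tweets (illustrative values only).
probs = {
    "MultiFiT": [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]],
    "BERT":     [[0.5, 0.4, 0.1], [0.1, 0.6, 0.3]],
    "ALBERT":   [[0.2, 0.6, 0.2], [0.2, 0.2, 0.6]],
    "XLNET":    [[0.5, 0.3, 0.2], [0.3, 0.3, 0.4]],
}

avg = average_softmax(probs)
labels = predict_labels(avg)
print(labels)
```

In this toy input ALBERT disagrees with the other models on the first tweet and BERT on the second, yet the averaged distribution still follows the majority. Other combination rules, such as weighted averaging or voting, would fit the same interface.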