CL · Apr 3, 2018

Sentiment Analysis of Code-Mixed Languages leveraging Resource Rich Languages

Nurendra Choudhary, Rajat Singh, Ishita Bindlish, Manish Shrivastava

arXiv:1804.00806v1 · 1.660 citations

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of processing code-mixed text in NLP, which matters for multilingual communities, but the advance appears incremental because it builds on existing contrastive learning and siamese network techniques.

The paper tackles sentiment analysis for code-mixed languages by proposing SACMT, a novel approach using contrastive learning and siamese networks to map code-mixed and standard languages to a common sentiment space, achieving 7.6% higher accuracy and 10.1% higher F-score than state-of-the-art methods.

Code-mixed data is an important challenge in natural language processing because its characteristics differ completely from the traditional structures of standard languages. In this paper, we propose a novel approach called Sentiment Analysis of Code-Mixed Text (SACMT) that uses contrastive learning to classify sentences by sentiment: positive, negative, or neutral. We use the shared parameters of siamese networks to map sentences of code-mixed and standard languages to a common sentiment space. We also introduce a basic clustering-based preprocessing method to capture variations of code-mixed transliterated words. Our experiments show that SACMT outperforms the state-of-the-art approaches in sentiment analysis for code-mixed text by 7.6% in accuracy and 10.1% in F-score.
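The idea of mapping two inputs into a common space with shared parameters and training with a contrastive objective can be illustrated with a minimal sketch. This is not the SACMT implementation: the one-layer `encode` function, the toy feature vectors, the `margin` value, and all function names are assumptions made for illustration, and the loss is the widely used margin-based contrastive form, which may differ from the one in the paper.

```python
import math


def encode(features, weights):
    """Hypothetical shared encoder: one linear layer followed by tanh.

    Both branches of the siamese pair call this with the SAME weights,
    which is what "shared parameters" means in this sketch.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in weights]


def euclidean(a, b):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, same_sentiment, margin=1.0):
    """Margin-based contrastive loss for one pair.

    Pairs with the same sentiment are pulled together (loss grows with
    distance); pairs with different sentiment are pushed apart until
    they are at least `margin` away from each other.
    """
    if same_sentiment:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def pair_loss(code_mixed_features, standard_features, weights,
              same_sentiment, margin=1.0):
    """Encode both sentences with shared weights and score the pair."""
    left = encode(code_mixed_features, weights)
    right = encode(standard_features, weights)
    return contrastive_loss(euclidean(left, right), same_sentiment, margin)
```

In a setup like this, `pair_loss` would be minimized over many (code-mixed sentence, standard-language sentence) pairs, so that sentences with the same sentiment end up close together in the shared space regardless of which language they came from.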
