CL, Aug 25, 2020

The Impact of Indirect Machine Translation on Sentiment Classification

Alberto Poncelas, Pintu Lohar, Andy Way, James Hadley

arXiv:2008.11257v131.0997 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses data scarcity in sentiment classification for low-resource languages or domains, but it is incremental, as it builds on existing translation and classification methods.

The study investigated how machine translation, including indirect translation via a pivot language, affects sentiment classification performance on customer feedback, analyzing when translation improves or harms classifier accuracy.

Sentiment classification has been crucial for many natural language processing (NLP) applications, such as the analysis of movie reviews, tweets, or customer feedback. A sufficiently large amount of data is required to build a robust sentiment classification system. However, such resources are not always available for all domains or for all languages. In this work, we propose employing a machine translation (MT) system to translate customer feedback into another language to investigate in which cases translated sentences can have a positive or negative impact on an automatic sentiment classifier. Furthermore, as performing a direct translation is not always possible, we explore the performance of automatic classifiers on sentences that have been translated using a pivot MT system. We conduct several experiments using the above approaches to analyse the performance of our proposed sentiment classification system and discuss the advantages and drawbacks of classifying translated sentences.
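The pivot setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the word-lookup "translators", the lexicon-based classifier, the language pair (French to English via Spanish), and every name below are hypothetical stand-ins, chosen only to show how two MT steps are chained before classification.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]


def make_word_translator(table: Dict[str, str]) -> Translator:
    """Toy stand-in for an MT system (assumption): word-by-word lookup.

    Words missing from the table pass through unchanged.
    """
    def translate(sentence: str) -> str:
        return " ".join(table.get(w, w) for w in sentence.lower().split())
    return translate


def pivot(src_to_pivot: Translator, pivot_to_tgt: Translator) -> Translator:
    """Chain two translators: source -> pivot language -> target."""
    def translate(sentence: str) -> str:
        return pivot_to_tgt(src_to_pivot(sentence))
    return translate


# Hypothetical toy vocabularies: French -> Spanish (pivot) -> English.
fr_to_es = make_word_translator(
    {"le": "el", "service": "servicio", "est": "es", "excellent": "excelente"}
)
es_to_en = make_word_translator(
    {"el": "the", "servicio": "service", "es": "is", "excelente": "excellent"}
)

# Hypothetical sentiment lexicons for the target language.
POSITIVE = {"excellent", "good", "great"}
NEGATIVE = {"terrible", "bad", "poor"}


def classify(sentence: str) -> str:
    """Toy lexicon classifier (assumption): count polarity words."""
    words = sentence.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


fr_to_en = pivot(fr_to_es, es_to_en)
translated = fr_to_en("Le service est excellent")
label = classify(translated)
print(translated, "->", label)
```

In a real pipeline each toy translator would be replaced by a trained MT model, and any error introduced in the first step is carried into the second, which is one way indirect translation can affect the classifier.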
