Cross-Lingual Sentiment Analysis Without (Good) Translation
This work offers an incremental improvement for NLP researchers and practitioners: it reduces the reliance on accurate translation when performing sentiment analysis in low-resource languages.
The paper tackles cross-lingual sentiment analysis with a single linear transformation, learned from a small set of word pairs (as few as 2000), that captures sentiment relationships between words across languages. This enables low-cost analysis in non-English languages where data is scarce.
Current approaches to cross-lingual sentiment analysis try to leverage the wealth of labeled English data using bilingual lexicons, bilingual vector space embeddings, or machine translation systems. Here we show that it is possible to use a single linear transformation, with as few as 2000 word pairs, to capture fine-grained sentiment relationships between words in a cross-lingual setting. We apply these cross-lingual sentiment models to a diverse set of tasks to demonstrate their functionality in a non-English context. By effectively leveraging English sentiment knowledge without the need for accurate translation, we can analyze and extract features from other languages with scarce data at a very low cost, thus making sentiment and related analyses for many languages inexpensive.
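To make the idea of a single linear transformation concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes the map is fit by ordinary least squares on paired word vectors, uses synthetic random vectors in place of real embeddings, and stands in a hypothetical linear scorer (`english_sentiment_score`) for a sentiment model trained on English data. The names `fit_linear_map`, `true_map`, and `sentiment_direction`, as well as the embedding dimensions, are illustrative and do not come from the source.

```python
import numpy as np


def fit_linear_map(src_vecs, en_vecs):
    """Return W minimising ||src_vecs @ W - en_vecs||_F (least squares)."""
    W, *_ = np.linalg.lstsq(src_vecs, en_vecs, rcond=None)
    return W


rng = np.random.default_rng(0)

# Assumption: 2000 word pairs; the dimensions are arbitrary illustrative values.
n_pairs, dim_src, dim_en = 2000, 50, 40

# Synthetic stand-ins for real embeddings: the "English" vectors are a noisy
# linear image of the source-language vectors.
true_map = rng.normal(size=(dim_src, dim_en))
src = rng.normal(size=(n_pairs, dim_src))
en = src @ true_map + 0.01 * rng.normal(size=(n_pairs, dim_en))

# Learn the transformation from the word pairs alone.
W = fit_linear_map(src, en)

# Hypothetical English sentiment model: a fixed direction in English space.
sentiment_direction = rng.normal(size=dim_en)


def english_sentiment_score(vec):
    """Score a vector that lives in the English embedding space."""
    return float(vec @ sentiment_direction)


# A source-language word outside the training pairs is projected into the
# English space and scored there, with no translation step involved.
new_word = rng.normal(size=dim_src)
projected = new_word @ W
score = english_sentiment_score(projected)
print(f"projected sentiment score: {score:.3f}")
```

In practice the two vector sets would come from separately trained monolingual embeddings, and the scorer from a model trained on labeled English data; the sketch only shows how a small set of word pairs suffices to fit the map and reuse an English-side model.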