Machine Translation for Machines: the Sentiment Classification Use Case
This work addresses the challenge of improving translation quality for specific machine tasks, such as sentiment analysis. The contribution is incremental: it builds on existing NMT methods with a new optimization focus.
The paper tackles the problem of improving machine translation for downstream NLP tasks. It proposes a neural machine translation approach that generates translations optimized for a machine-oriented criterion, specifically sentiment classification. On German and Italian Twitter data, the resulting classification results outperform those obtained with general-purpose NMT models and approximate the accuracy obtained on gold standard English tweets.
We propose a neural machine translation (NMT) approach that, instead of pursuing adequacy and fluency ("human-oriented" quality criteria), aims to generate translations that are best suited as input to a natural language processing component designed for a specific downstream task (a "machine-oriented" criterion). Towards this objective, we present a reinforcement learning technique based on a new candidate sampling strategy, which exploits the results obtained on the downstream task as weak feedback. Experiments in sentiment classification of Twitter data in German and Italian show that feeding an English classifier with machine-oriented translations significantly improves its performance. Classification results outperform those obtained with translations produced by general-purpose NMT models as well as by an approach based on reinforcement learning. Moreover, our results on both languages approximate the classification accuracy computed on gold standard English tweets.
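The idea of using downstream classification results as weak feedback can be illustrated with a minimal sketch. The abstract does not specify the candidate sampling strategy, so the code below uses plain REINFORCE with a mean-reward baseline as a stand-in, not the authors' method. Everything in it is a hypothetical assumption made for illustration: the "translation model" is reduced to a softmax over a fixed list of candidate translations, and `toy_classifier_reward` is a keyword-based placeholder for the English sentiment classifier.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_classifier_reward(candidate, gold_label):
    """Hypothetical stand-in for the downstream English sentiment classifier.

    Returns the probability it assigns to the gold label, used as the reward.
    """
    positive_words = {"great", "love"}
    negative_words = {"awful", "hate"}
    tokens = candidate.split()
    score = sum(t in positive_words for t in tokens) - sum(
        t in negative_words for t in tokens
    )
    p_positive = 1.0 / (1.0 + math.exp(-2.0 * score))
    return p_positive if gold_label == "positive" else 1.0 - p_positive


def reinforce_step(logits, candidates, gold_label, rng, num_samples=8, lr=0.5):
    """One policy-gradient update on the toy translation policy."""
    probs = softmax(logits)
    # Sample candidate translations from the current policy.
    sampled = rng.choices(range(len(candidates)), weights=probs, k=num_samples)
    # Weak feedback: how well the classifier does on each sampled translation.
    rewards = [toy_classifier_reward(candidates[i], gold_label) for i in sampled]
    baseline = sum(rewards) / len(rewards)  # mean baseline reduces variance
    grads = [0.0] * len(logits)
    for idx, reward in zip(sampled, rewards):
        advantage = reward - baseline
        for j in range(len(logits)):
            # d log p(idx) / d logit_j = 1[j == idx] - p_j
            grads[j] += advantage * ((1.0 if j == idx else 0.0) - probs[j])
    return [l + lr * g / num_samples for l, g in zip(logits, grads)]


def train(candidates, gold_label, steps=300, seed=0):
    """Run repeated updates and return the final candidate probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        logits = reinforce_step(logits, candidates, gold_label, rng)
    return softmax(logits)


if __name__ == "__main__":
    candidates = [
        "i love this phone",
        "this phone is okay",
        "i hate this phone",
    ]
    for text, p in zip(candidates, train(candidates, "positive")):
        print(f"{p:.3f}  {text}")
```

In this toy setting the policy shifts probability mass toward the candidate that makes the classifier most confident in the gold label, regardless of how fluent or adequate that candidate would look to a human reader. A real system would replace the fixed candidate list with sequences sampled from an NMT decoder and the keyword scorer with a trained classifier.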