Learning to Translate for Multilingual Question Answering
This paper addresses the challenge of handling multiple languages in question answering systems, though the contribution appears incremental because it builds on existing translation methods.
The paper tackles multilingual question answering by exploring translation directions and methods, and introduces a learn-to-translate approach that outperforms a strong baseline with statistical significance (p<0.05) on a dataset of forum posts in English, Arabic, and Chinese.
In multilingual question answering, either the question needs to be translated into the document language, or vice versa. In addition to direction, there are multiple methods to perform the translation, four of which we explore in this paper: word-based, 10-best, context-based, and grammar-based. We build a feature for each combination of translation direction and method, and train a model that learns optimal feature weights. On a large forum dataset consisting of posts in English, Arabic, and Chinese, our novel learn-to-translate approach was more effective (p<0.05) than a strong baseline that translates all text into English and then trains a classifier based only on English (original or translated) text.
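The feature construction described above can be illustrated with a minimal sketch. It assumes that each (direction, method) combination yields one numeric question-answer similarity score and that the weights are learned with plain logistic regression. The abstract specifies neither the scoring function nor the model family, so the names `DIRECTIONS`, `METHODS`, `feature_vector`, `train_weights`, and `predict` are hypothetical and chosen only for illustration.

```python
import itertools
import math

# Two translation directions and the four methods named in the abstract.
# The string labels are illustrative, not taken from the paper.
DIRECTIONS = ("question_to_document", "document_to_question")
METHODS = ("word", "10best", "context", "grammar")
COMBINATIONS = list(itertools.product(DIRECTIONS, METHODS))
FEATURE_NAMES = [f"{d}:{m}" for d, m in COMBINATIONS]  # 2 x 4 = 8 features


def feature_vector(scores):
    """Order a {(direction, method): score} mapping into a fixed-length list.

    Assumption: the scores come from some similarity function computed on the
    translated text; combinations that are missing default to 0.0.
    """
    return [scores.get(combo, 0.0) for combo in COMBINATIONS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that the candidate answer is relevant to the question."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_weights(X, y, lr=0.5, epochs=500):
    """Fit one weight per feature (plus a bias) by batch gradient descent.

    Logistic regression is an assumed stand-in for the paper's model.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b
```

Calling `train_weights` on labeled question-answer pairs returns one weight per (direction, method) feature, so the learned weights indicate how much each translation variant contributes to the relevance decision. The English-only baseline corresponds, roughly, to using only features computed on English text.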