Instance-based Transfer Learning for Multilingual Deep Retrieval
This paper addresses the challenge of improving search performance in multilingual settings and reports gains from instance-based transfer learning.
The paper tackles multilingual search by applying instance-based transfer learning to the next-sentence prediction and inverse cloze tasks, achieving positive transfer on all 35 target languages and on both tasks tested.
We focus on the problem of search in the multilingual setting. Examining the problems of next-sentence prediction and inverse cloze, we show that at large scale, instance-based transfer learning is surprisingly effective in the multilingual setting, leading to positive transfer on all of the 35 target languages and two tasks tested. We analyze this improvement and argue that the most natural explanation, namely direct vocabulary overlap between languages, only partially explains the performance gains: in fact, we demonstrate target-language improvement can occur after adding data from an auxiliary language even with no vocabulary in common with the target. This surprising result is due to the effect of transitive vocabulary overlaps between pairs of auxiliary and target languages.
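To make the distinction between direct and transitive vocabulary overlap concrete, the following is a minimal sketch on made-up data. The abstract does not specify how overlap is measured, so everything here is an assumption for illustration: the toy vocabularies, the language names `aux`, `bridge`, and `target`, and the helper functions `overlap_graph` and `overlap_path` are hypothetical and are not the paper's method. The sketch treats languages as nodes, links two languages when their vocabularies share at least one token, and then looks for a chain of such links.

```python
from collections import deque


def overlap_graph(vocabs):
    """Link two languages when their vocabularies share at least one token."""
    langs = list(vocabs)
    graph = {lang: set() for lang in langs}
    for i, a in enumerate(langs):
        for b in langs[i + 1:]:
            if vocabs[a] & vocabs[b]:  # non-empty set intersection
                graph[a].add(b)
                graph[b].add(a)
    return graph


def overlap_path(graph, source, target):
    """Breadth-first search for the shortest chain of overlapping languages."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of overlaps connects the two languages


# Hypothetical toy vocabularies: "aux" and "target" share no tokens directly,
# but each shares a token with "bridge".
vocabs = {
    "aux": {"a1", "a2", "shared_ab"},
    "bridge": {"shared_ab", "b1", "shared_bt"},
    "target": {"shared_bt", "t1"},
}

graph = overlap_graph(vocabs)
direct = vocabs["aux"] & vocabs["target"]    # direct overlap (empty here)
path = overlap_path(graph, "aux", "target")  # transitive chain via "bridge"
```

In this toy setup the direct overlap between `aux` and `target` is empty, yet the two are connected through `bridge`. This is the kind of indirect link the abstract refers to as transitive vocabulary overlap, shown here only as a structural illustration and not as evidence for the reported gains.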