IR · LG · Nov 2, 2020

Cross-Lingual Document Retrieval with Smooth Learning

arXiv:2011.00701v1994 citations
AI Analysis

This paper addresses unreliable cross-lingual search, a problem for users who need information across language barriers. It is an incremental improvement with specific technical contributions.

The paper tackles the instability of neural models in cross-lingual document retrieval by proposing a robust framework built on smooth cosine similarity and a novel loss function, and it reports significant gains in ranking metrics across several languages.

Cross-lingual document search is an information retrieval task in which the language of the queries differs from the language of the documents. In this paper, we study the instability of neural document search models and propose a novel end-to-end robust framework that achieves improved performance in cross-lingual search across different document languages. This framework includes a novel measure of relevance between queries and documents, smooth cosine similarity, and a novel loss function, Smooth Ordinal Search Loss, as the objective. We further provide a theoretical guarantee on the generalization error bound for the proposed framework. We conduct experiments to compare our approach with other document search models, and observe significant gains under commonly used ranking metrics on the cross-lingual document retrieval task in a variety of languages.
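The abstract names smooth cosine similarity but does not define it. As a rough illustration only, the sketch below shows one common way to smooth cosine similarity: adding a small constant to each vector norm so the score stays finite and bounded even for near-zero embeddings. The function name `smooth_cosine`, the parameter `eps`, and the exact formula are assumptions made for this example and are not taken from the paper.

```python
import math


def smooth_cosine(q, d, eps=1.0):
    """Cosine similarity with a smoothing constant added to each norm.

    Assumed form (hypothetical, not the paper's definition):
        (q . d) / ((|q| + eps) * (|d| + eps))
    With eps > 0 the score is defined for zero vectors and its
    magnitude is strictly less than 1.
    """
    if len(q) != len(d):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(q, d))
    q_norm = math.sqrt(sum(a * a for a in q))
    d_norm = math.sqrt(sum(b * b for b in d))
    return dot / ((q_norm + eps) * (d_norm + eps))


# Hypothetical query and document embeddings.
query_vec = [3.0, 4.0]
doc_vec = [3.0, 4.0]
score = smooth_cosine(query_vec, doc_vec, eps=1.0)
```

Under this assumed form, setting `eps=0.0` recovers ordinary cosine similarity for non-zero vectors, while a larger `eps` shrinks the scores of short embeddings toward zero, which is one plausible way to reduce sensitivity to small-norm vectors.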

Code Implementations: 1 repo
