IR · CL · Jun 17, 2023

Typo-Robust Representation Learning for Dense Retrieval

arXiv:2306.10348v1224 citations · h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses a real-world information retrieval challenge, user queries that contain typos, but the advance is incremental because it builds on existing alignment approaches.

The paper tackles misspelled queries in dense retrieval by improving representation learning in two ways: it aligns each misspelled query with its pristine counterpart, and it strengthens the contrast between each misspelled query and its surrounding queries. The resulting method outperforms existing competitors on two benchmark datasets with misspelled queries.

Dense retrieval is a basic building block of information retrieval applications. One of the main challenges of dense retrieval in real-world settings is the handling of queries containing misspelled words. A popular approach for handling misspelled queries is minimizing the representation discrepancy between misspelled queries and their pristine counterparts. Unlike the existing approaches, which focus only on the alignment between misspelled and pristine queries, our method also improves the contrast between each misspelled query and its surrounding queries. To assess the effectiveness of our proposed method, we compare it against the existing competitors using two benchmark datasets and two base encoders. Our method outperforms the competitors in all cases with misspelled queries. Our code and models are available at https://github.com/panuthept/DST-DenseRetrieval.
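The abstract names two training signals, alignment and contrast, without giving their exact form. The sketch below is one plausible way to combine them and is not the authors' formulation: it assumes a mean squared distance for alignment and an in-batch InfoNCE loss for contrast. The function names and the `weight` and `temperature` parameters are hypothetical, chosen only for illustration; the linked repository is the authority on the actual method.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale a vector to unit length so dot products act as cosine similarity.
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def alignment_loss(misspelled, pristine):
    # Assumed form: mean squared Euclidean distance between each misspelled
    # query embedding and the embedding of its pristine counterpart.
    total = 0.0
    for m, p in zip(misspelled, pristine):
        total += sum((a - b) ** 2 for a, b in zip(m, p))
    return total / len(misspelled)

def contrast_loss(misspelled, pristine, temperature=0.1):
    # Assumed form: in-batch InfoNCE. Row i's positive is pristine query i;
    # the other pristine queries in the batch act as surrounding negatives.
    total = 0.0
    for i, m in enumerate(misspelled):
        logits = [dot(m, p) / temperature for p in pristine]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_denominator = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        total += log_denominator - logits[i]
    return total / len(misspelled)

def typo_robust_loss(misspelled, pristine, weight=1.0, temperature=0.1):
    # Hypothetical combination: alignment plus a weighted contrast term.
    m = [normalize(v) for v in misspelled]
    p = [normalize(v) for v in pristine]
    return alignment_loss(m, p) + weight * contrast_loss(m, p, temperature)

# Toy batch: two pristine query embeddings and two sets of "misspelled" ones.
pristine_batch = [[1.0, 0.0], [0.0, 1.0]]
close_typos = [[0.9, 0.1], [0.1, 0.9]]    # each near its own pristine query
swapped_typos = [[0.1, 0.9], [0.9, 0.1]]  # each near the wrong pristine query

loss_close = typo_robust_loss(close_typos, pristine_batch)
loss_swapped = typo_robust_loss(swapped_typos, pristine_batch)
```

In this toy batch, `loss_close` comes out smaller than `loss_swapped`, because embeddings that sit near their own pristine query are rewarded by both terms. In practice the embeddings would come from the base encoders, and the contrast term is what separates a misspelled query from neighbouring queries that alignment alone would leave untouched.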

Code Implementations: 1 repo