HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking
This work addresses a gap in multi-stage text retrieval, relevant to both researchers and practitioners, by directly coupling features from multiple stages. The contribution is incremental in that it builds on existing retrieve-then-rerank architectures.
The paper tackles the problem of optimizing multi-stage text retrieval systems by introducing HLATR, a hybrid list-aware transformer reranking module that incorporates features from both retrieval and reranking stages, resulting in improved ranking performance on two large-scale datasets.
Deep pre-trained language models (e.g., BERT) are effective at large-scale text retrieval tasks. Existing text retrieval systems with state-of-the-art performance usually adopt a retrieve-then-rerank architecture because of the high computational cost of pre-trained language models and the large corpus size. Under such a multi-stage architecture, previous studies have mainly focused on optimizing a single stage of the framework to improve overall retrieval performance. However, how to directly couple multi-stage features for optimization has not been well studied. In this paper, we design Hybrid List Aware Transformer Reranking (HLATR) as a subsequent reranking module that incorporates both retrieval-stage and reranking-stage features. HLATR is lightweight and easy to parallelize alongside existing text retrieval systems, so the reranking can be performed in a single, efficient pass. Empirical experiments on two large-scale text retrieval datasets show that HLATR can efficiently improve the ranking performance of existing multi-stage text retrieval methods.
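To make the idea of coupling features from both stages concrete, the following is a minimal sketch and not the HLATR model itself. The abstract does not specify how the features are combined, so everything here is an assumption made for illustration: the function name `hybrid_list_aware_rerank`, the use of raw per-stage scores as features, a single untrained self-attention pass standing in for a trained transformer, and the fixed scoring weights.

```python
import math


def zscores(values):
    """Standardize scores within one candidate list (zero mean, unit variance)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        return [0.0 for _ in values]  # all candidates tie on this feature
    return [(v - mean) / std for v in values]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_list_aware_rerank(candidates, weights=(0.5, 0.5)):
    """Fuse retrieval and reranking scores with one list-wide attention pass.

    candidates: list of (doc_id, retrieval_score, rerank_score) for one query.
    weights: hypothetical, untrained scoring weights (assumption, not from the paper).
    Returns (doc_id, fused_score) pairs sorted by fused_score, highest first.
    """
    if not candidates:
        return []
    retrieval = zscores([c[1] for c in candidates])
    rerank = zscores([c[2] for c in candidates])
    feats = list(zip(retrieval, rerank))  # hybrid feature vector per candidate
    scale = math.sqrt(2)  # square root of the feature dimension
    fused = []
    for cand, f_i in zip(candidates, feats):
        # Attention of this candidate over the whole list, itself included.
        logits = [(f_i[0] * f_j[0] + f_i[1] * f_j[1]) / scale for f_j in feats]
        attn = softmax(logits)
        context = [sum(a * f_j[k] for a, f_j in zip(attn, feats)) for k in range(2)]
        # Residual connection, then a fixed linear scoring head.
        score = sum(w * (f_i[k] + context[k]) for k, w in enumerate(weights))
        fused.append((cand[0], score))
    return sorted(fused, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    demo = [("a", 10.0, 0.9), ("b", 5.0, 0.5), ("c", 0.0, 0.1)]
    for doc_id, score in hybrid_list_aware_rerank(demo):
        print(doc_id, round(score, 3))
```

Because each candidate's fused score depends on the other candidates in the list, the output is list-aware in the loose sense used above. A trained model would learn the attention and scoring parameters rather than fixing them by hand.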