Neural document expansion for ad-hoc information retrieval
This work addresses the challenge of applying neural document expansion to standard IR tasks, which often lack sufficient in-domain training data and contain long documents. The results are relevant to researchers and practitioners in information retrieval.
The authors adapted a neural Seq2Seq document expansion model to standard information retrieval tasks, which typically involve scarce labels and long documents. They demonstrated that this adaptation can be effective despite these challenges.
Recently, Nogueira et al. [2019] proposed a new approach to document expansion based on a neural Seq2Seq model, showing significant improvements on a short-text retrieval task. However, this approach requires a large amount of in-domain training data. In this paper, we show that this neural document expansion approach can be effectively adapted to standard IR tasks, where labels are scarce and many long documents are present.