Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations
Deep Retrieval addresses the core efficiency and accuracy challenge of candidate retrieval in industrial-scale recommendation systems. According to the authors, it is among the first non-ANN methods successfully deployed at the scale of hundreds of millions of items.
The paper tackles efficient top-candidate retrieval in large-scale recommendation by introducing Deep Retrieval (DR), which learns a retrievable structure directly from user-item interaction data without the Euclidean space assumption of ANN algorithms. DR has sub-linear computational complexity, achieves almost the same accuracy as a brute-force baseline on two public datasets, and significantly outperforms a well-tuned ANN baseline on engagement metrics in a live production system.
One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use some approximate nearest neighbor (ANN) search algorithm to find top candidates. In this paper, we present Deep Retrieval (DR), to learn a retrievable structure directly with user-item interaction data (e.g. clicks) without resorting to the Euclidean space assumption in ANN algorithms. DR's structure encodes all candidate items into a discrete latent space. Those latent codes for the candidates are model parameters and learnt together with other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the structure is performed to retrieve the top candidates for reranking. Empirically, we first demonstrate that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline on two public datasets. Moreover, we show that, in a live production recommendation system, a deployed DR approach significantly outperforms a well-tuned ANN baseline in terms of engagement metrics. To the best of our knowledge, DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.
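The retrieval step described in the abstract, a beam search over a structure of discrete latent codes, can be illustrated with a minimal Python sketch. The abstract does not specify the shape of the structure, so the layout here is an assumption: each item is mapped to a path of `depth` codes, each code drawn from `range(width)`. The names `beam_search`, `retrieve`, `toy_log_prob`, and `path_to_items` are hypothetical, and the toy scoring function merely stands in for the learnt user-conditioned model.

```python
import math
from typing import Callable, Dict, List, Tuple

Path = Tuple[int, ...]


def beam_search(
    log_prob: Callable[[Path, int], float],
    depth: int,
    width: int,
    beam_size: int,
) -> List[Tuple[Path, float]]:
    """Return up to `beam_size` highest-scoring paths of length `depth`.

    `log_prob(prefix, code)` is assumed to give the log-probability of
    choosing `code` at the next layer given the codes chosen so far.
    """
    beams: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(depth):
        # Expand every surviving prefix by every code in the next layer.
        candidates = [
            (prefix + (code,), score + log_prob(prefix, code))
            for prefix, score in beams
            for code in range(width)
        ]
        # Keep only the best `beam_size` partial paths.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


def retrieve(
    beams: List[Tuple[Path, float]],
    path_to_items: Dict[Path, List[str]],
) -> List[str]:
    """Collect the items attached to the retrieved paths, best path first."""
    seen = set()
    items: List[str] = []
    for path, _ in beams:
        for item in path_to_items.get(path, []):
            if item not in seen:
                seen.add(item)
                items.append(item)
    return items


def toy_log_prob(prefix: Path, code: int) -> float:
    # Toy stand-in for the learnt model: always prefers code 1.
    return math.log(0.7) if code == 1 else math.log(0.1)


# Hypothetical item-to-path assignment; in DR these codes are learnt.
path_to_items = {
    (1, 1, 1): ["item_a", "item_b"],
    (1, 1, 0): ["item_c"],
    (0, 0, 0): ["item_z"],
}

beams = beam_search(toy_log_prob, depth=3, width=4, beam_size=2)
items = retrieve(beams, path_to_items)
```

In this sketch the search evaluates at most `depth * beam_size * width` path extensions, a count that does not depend on the number of items. That is one way a structure of this kind can give sub-linear retrieval cost. The candidates returned would then be passed to a reranking stage, as the abstract describes.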