IRJan 23, 2022

Reinforcement Routing on Proximity Graph for Efficient Recommendation

arXiv:2201.09290v132 citations
Originality Incremental advance
AI Analysis

This work addresses the efficiency of recommendation systems by improving MIPS, a key computational bottleneck, though it appears incremental as it builds on existing graph-based methods.

The paper tackles the Maximum Inner Product Search (MIPS) problem, which is crucial for speeding up recommendation systems, by proposing a reinforcement learning model combined with imitation learning to automatically search on proximity graphs, achieving superior performance over state-of-the-art methods.

We focus on Maximum Inner Product Search (MIPS), which is an essential problem in many machine learning communities. Given a query, MIPS finds the most similar items with the maximum inner products. Methods for Nearest Neighbor Search (NNS) which is usually defined on metric space don't exhibit the satisfactory performance for MIPS problem since inner product is a non-metric function. However, inner products exhibit many good properties compared with metric functions, such as avoiding vanishing and exploding gradients. As a result, inner product is widely used in many recommendation systems, which makes efficient Maximum Inner Product Search a key for speeding up many recommendation systems. Graph based methods for NNS problem show the superiorities compared with other class methods. Each data point of the database is mapped to a node of the proximity graph. Nearest neighbor search in the database can be converted to route on the proximity graph to find the nearest neighbor for the query. This technique can be used to solve MIPS problem. Instead of searching the nearest neighbor for the query, we search the item with maximum inner product with query on the proximity graph. In this paper, we propose a reinforcement model to train an agent to search on the proximity graph automatically for MIPS problem if we lack the ground truths of training queries. If we know the ground truths of some training queries, our model can also utilize these ground truths by imitation learning to improve the agent's search ability. By experiments, we can see that our proposed mode which combines reinforcement learning with imitation learning shows the superiorities over the state-of-the-art methods

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes