IR Jan 23, 2022

Reinforcement Routing on Proximity Graph for Efficient Recommendation

Chao Feng, Defu Lian, Xiting Wang, Zheng Liu, Xing Xie, Enhong Chen

arXiv:2201.09290v1 12.932 citations

Originality: Incremental advance

AI Analysis

This work addresses the efficiency of recommendation systems by improving MIPS, a key computational bottleneck, though it appears incremental as it builds on existing graph-based methods.

The paper tackles the Maximum Inner Product Search (MIPS) problem, which is crucial for speeding up recommendation systems, by proposing a reinforcement learning model combined with imitation learning to automatically search on proximity graphs, achieving superior performance over state-of-the-art methods.

We focus on Maximum Inner Product Search (MIPS), an essential problem in many machine learning communities. Given a query, MIPS finds the items that have the maximum inner product with it. Methods for Nearest Neighbor Search (NNS), which is usually defined on a metric space, do not perform satisfactorily on MIPS because the inner product is a non-metric function. However, inner products have several good properties compared with metric functions, such as avoiding vanishing and exploding gradients. As a result, the inner product is widely used in recommendation systems, which makes efficient MIPS key to speeding them up. Graph-based methods for NNS show superior performance compared with other classes of methods. Each data point in the database is mapped to a node of a proximity graph, so nearest neighbor search in the database becomes routing on the proximity graph to find the nearest neighbor of the query. This technique can also be used to solve MIPS: instead of searching for the nearest neighbor of the query, we search the proximity graph for the item with the maximum inner product with the query. In this paper, we propose a reinforcement learning model that trains an agent to search the proximity graph automatically for MIPS when the ground truths of the training queries are unavailable. When the ground truths of some training queries are known, our model can also exploit them through imitation learning to improve the agent's search ability. Experiments show that our proposed model, which combines reinforcement learning with imitation learning, outperforms state-of-the-art methods.
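To make the routing idea concrete, here is a minimal sketch of plain greedy routing on a proximity graph for MIPS, compared against an exhaustive scan. It is not the paper's method: the learned agent described above would replace the hand-written greedy rule in `greedy_route`. The graph construction (`build_graph`, linking each item to the `k` items with the largest inner product to it), the function names, and the random data are all assumptions made for illustration.

```python
import random


def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def brute_force_mips(items, query):
    """Exact MIPS by scanning every item; returns the index of the best item."""
    return max(range(len(items)), key=lambda i: inner(items[i], query))


def build_graph(items, k):
    """Hypothetical proximity graph: link each item to the k other items
    that have the largest inner product with it."""
    graph = []
    for i, x in enumerate(items):
        others = [j for j in range(len(items)) if j != i]
        others.sort(key=lambda j: inner(x, items[j]), reverse=True)
        graph.append(others[:k])
    return graph


def greedy_route(items, graph, query, start, max_steps=100):
    """Walk the graph from `start`, always moving to the neighbor with the
    largest inner product with the query; stop when no neighbor improves."""
    current = start
    path = [current]
    for _ in range(max_steps):
        best = max(graph[current], key=lambda j: inner(items[j], query))
        if inner(items[best], query) <= inner(items[current], query):
            break  # local optimum: no neighbor beats the current node
        current = best
        path.append(current)
    return current, path


# Small synthetic example with a fixed seed (assumed data, not from the paper).
rng = random.Random(0)
items = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
query = [rng.gauss(0, 1) for _ in range(8)]

graph = build_graph(items, k=8)
exact = brute_force_mips(items, query)
found, path = greedy_route(items, graph, query, start=0)

print("exact best score:", inner(items[exact], query))
print("greedy score:    ", inner(items[found], query), "after", len(path) - 1, "hops")
```

Greedy routing only guarantees a local optimum on the graph, so its result can fall short of the exact answer. That gap is the kind of shortfall a learned routing policy is meant to reduce.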
