IR · LG · Feb 3, 2019

Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems

arXiv:1902.00851v1 · 18 citations
Originality: Highly original
AI Analysis

This work addresses the need for commercial recommendation systems to directly boost revenue, offering a novel approach that bridges gaps in existing methods focused on accuracy metrics.

The paper aligns recommendation with profit maximization in e-commerce. It proposes a value-aware framework that assigns an economic value to each type of user action and uses reinforcement learning to optimize the recommendation list, and it reports improvements on both traditional ranking tasks and economic profit.

Existing recommendation algorithms mostly focus on optimizing traditional recommendation measures, such as the accuracy of rating prediction in terms of RMSE or the quality of top-$k$ recommendation lists in terms of precision, recall, MAP, etc. However, an important expectation for commercial recommendation systems is to improve the final revenue/profit of the system. Traditional recommendation targets such as rating prediction and top-$k$ recommendation are not directly related to this goal. In this work, we blend the fundamental concepts in online advertising and micro-economics into personalized recommendation for profit maximization. Specifically, we propose value-aware recommendation based on reinforcement learning, which directly optimizes the economic value of candidate items to generate the recommendation list. In particular, we generalize the basic concept of click conversion rate (CVR) in computational advertising into the conversion rate of an arbitrary user action (XVR) in E-commerce, where the user actions can be clicking, adding to cart, adding to wishlist, etc. In this way, each type of user action is mapped to its monetized economic value. Economic values of different user actions are further integrated as the reward of a ranking list, and reinforcement learning is used to optimize the recommendation list for the maximum total value. Experimental results in both offline benchmarks and online commercial systems verified the improved performance of our framework, in terms of both traditional top-$k$ ranking tasks and the economic profits of the system.
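To make the idea of mapping user actions to economic values concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each candidate item comes with a predicted rate per action type (the XVR) and that each action type has a fixed monetized value. The names `ACTION_VALUE`, `expected_item_value`, `rank_by_value`, and `list_reward`, and all numbers, are hypothetical. The reinforcement learning step described in the abstract is deliberately left out; a greedy sort by expected value stands in for it here.

```python
from typing import Dict, List

# Hypothetical monetized value of one occurrence of each action type.
# These numbers are illustrative assumptions, not taken from the paper.
ACTION_VALUE: Dict[str, float] = {
    "click": 0.05,
    "add_to_cart": 0.50,
    "add_to_wishlist": 0.20,
}


def expected_item_value(xvr: Dict[str, float],
                        action_value: Dict[str, float] = ACTION_VALUE) -> float:
    """Expected economic value of showing one item to one user.

    `xvr` maps an action type to its predicted rate (assumed to be given),
    so the value is sum_a rate(a) * value(a).
    """
    return sum(rate * action_value[action] for action, rate in xvr.items())


def rank_by_value(candidates: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Greedy top-k by expected value (a stand-in for the RL policy)."""
    ordered = sorted(candidates,
                     key=lambda item: expected_item_value(candidates[item]),
                     reverse=True)
    return ordered[:k]


def list_reward(ranked: List[str],
                candidates: Dict[str, Dict[str, float]]) -> float:
    """Reward of a ranking list: total expected value of its items."""
    return sum(expected_item_value(candidates[item]) for item in ranked)


if __name__ == "__main__":
    candidates = {
        "A": {"click": 0.2, "add_to_cart": 0.01},
        "B": {"click": 0.1, "add_to_cart": 0.10},
        "C": {"click": 0.4},
    }
    top = rank_by_value(candidates, k=2)
    print(top, list_reward(top, candidates))
```

In this toy setting, item C has the highest click rate but item B ranks first, because an add-to-cart is assumed to be worth far more than a click. A learned policy could additionally account for position effects and long-term value, which this sketch ignores.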

