R$^2$ec: Towards Large Recommender Models with Reasoning
This work addresses the challenge of making recommender systems more interpretable and efficient for users and developers, though its contribution is incremental, since it builds on existing LLM-based methods.
The paper tackles the problem of integrating reasoning capabilities into large recommender models. It proposes R$^2$ec, a unified model with a dual-head architecture for reasoning chain generation and item prediction, which reduces inference latency and outperforms traditional, LLM-based, and reasoning-augmented baselines on three datasets.
Large recommender models have extended LLMs as powerful recommenders via encoding or item generation, and recent breakthroughs in LLM reasoning have concurrently motivated the exploration of reasoning in recommendation. In this work, we propose R$^2$ec, a unified large recommender model with intrinsic reasoning capability. R$^2$ec introduces a dual-head architecture that supports both reasoning chain generation and efficient item prediction in a single model, significantly reducing inference latency. To overcome the lack of annotated reasoning data, we design RecPO, a reinforcement learning framework that optimizes reasoning and recommendation jointly with a novel fused reward mechanism. Extensive experiments on three datasets demonstrate that R$^2$ec outperforms traditional, LLM-based, and reasoning-augmented recommender baselines, while further analyses validate that its efficiency is competitive with conventional LLM-based recommender baselines and that it adapts well to diverse recommendation scenarios. Code and checkpoints are available at https://github.com/YRYangang/RRec.
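To make the dual-head idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes one shared hidden state feeds two projections: a language head that yields next-token logits for the reasoning chain, and a recommendation head that scores every candidate item in a single pass. The class name `DualHeadSketch`, the dot-product scoring, the random weights, and all dimensions are hypothetical choices made for illustration; the abstract does not specify them.

```python
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class DualHeadSketch:
    """Hypothetical two-head layout on top of one shared hidden state."""

    def __init__(self, hidden_dim, vocab_size, num_items, seed=0):
        rng = random.Random(seed)
        # Language head: one weight row per vocabulary token (assumed).
        self.lm_head = [[rng.gauss(0, 1) for _ in range(hidden_dim)]
                        for _ in range(vocab_size)]
        # Recommendation head: one embedding row per item (assumed).
        self.item_emb = [[rng.gauss(0, 1) for _ in range(hidden_dim)]
                         for _ in range(num_items)]

    def next_token_logits(self, hidden):
        # Used step by step while generating the reasoning chain.
        return matvec(self.lm_head, hidden)

    def item_scores(self, hidden):
        # Scores all items at once from the final hidden state.
        return matvec(self.item_emb, hidden)

    def top_k_items(self, hidden, k):
        scores = self.item_scores(hidden)
        ranked = sorted(range(len(scores)),
                        key=lambda i: scores[i], reverse=True)
        return ranked[:k]


# Toy usage with a made-up hidden state standing in for the backbone output.
model = DualHeadSketch(hidden_dim=4, vocab_size=10, num_items=6)
hidden = [0.1, -0.2, 0.3, 0.4]
token_logits = model.next_token_logits(hidden)
scores = model.item_scores(hidden)
top = model.top_k_items(hidden, k=3)
```

Under these assumptions, ranking items costs one matrix-vector product after the reasoning tokens are produced, instead of decoding each item token by token, which is one plausible reading of how a dedicated item head could reduce inference latency.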