Aligning Large Language Models for Controllable Recommendations
This work addresses the need for more conversational, explainable, and controllable recommender systems. It represents an incremental improvement over existing methods, which focus on accuracy but neglect instruction following.
The paper tackles the problem of aligning large language models (LLMs) for controllable recommendations by improving their ability to follow instructions while maintaining high accuracy. It introduces a set of supervised learning tasks and a reinforcement learning-based alignment procedure, and reports markedly better instruction compliance on two real-world datasets.
Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems: systems that are conversational, explainable, and controllable. However, existing literature primarily concentrates on integrating domain-specific knowledge into LLMs to enhance accuracy, often neglecting the ability to follow instructions. To address this gap, we first introduce a collection of supervised learning tasks, augmented with labels derived from a conventional recommender model, aimed at explicitly improving LLMs' proficiency in adhering to recommendation-specific instructions. We then develop a reinforcement learning-based alignment procedure to further strengthen LLMs' aptitude in responding to users' intentions and mitigating formatting errors. Through extensive experiments on two real-world datasets, our method markedly advances the capability of LLMs to comply with instructions within recommender systems, while sustaining a high level of accuracy.