IR AI | Dec 5, 2024

Pre-train, Align, and Disentangle: Empowering Sequential Recommendation with Large Language Models

Yuhao Wang, Junwei Pan, Pengyue Jia, Wanyu Wang, Maolin Wang, Zhixiang Feng, Xiaotian Li, Jie Jiang, Xiangyu Zhao

arXiv:2412.04107v29.29 citations | h-index: 16 | Has Code | SIGIR

Originality: Incremental advance

AI Analysis

This work aims to improve sequential recommendation for users and platforms by mitigating cold-start issues. It is an incremental advance: it builds on existing methods and contributes a new way of integrating them.

The paper tackles two weaknesses of sequential recommendation, the cold-start problem and sub-optimal performance, by introducing a Pre-train, Align, and Disentangle (PAD) framework that integrates large language models. Experiments on three public datasets show substantial improvements, particularly for cold items.

Sequential Recommendation (SR) aims to leverage the sequential patterns in users' historical interactions to accurately track their preferences. However, the primary reliance of existing SR methods on collaborative data results in challenges such as the cold-start problem and sub-optimal performance. Concurrently, despite the proven effectiveness of large language models (LLMs), their integration into commercial recommender systems is impeded by issues such as high inference latency, incomplete capture of all distribution statistics, and catastrophic forgetting. To address these issues, we introduce a novel Pre-train, Align, and Disentangle (PAD) framework to enhance SR models with LLMs. In particular, we initially pre-train both the SR and LLM models to obtain collaborative and textual embeddings. Subsequently, we propose a characteristic recommendation-anchored alignment loss using multi-kernel maximum mean discrepancy with Gaussian kernels. Lastly, a triple-experts architecture, comprising aligned and modality-specific experts with disentangled embeddings, is fine-tuned in a frequency-aware manner. Experimental results on three public datasets validate the efficacy of PAD, indicating substantial enhancements and compatibility with various SR backbone models, particularly for cold items. The code and datasets are accessible for reproduction at https://github.com/Applied-Machine-Learning-Lab/PAD.
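The alignment step in the abstract relies on multi-kernel maximum mean discrepancy (MK-MMD) with Gaussian kernels. The sketch below shows only the generic MK-MMD term between a batch of collaborative embeddings and a batch of textual embeddings. It is not the authors' implementation: the function names, the bandwidth values, and the equal weighting of kernels are assumptions made for illustration, and the "recommendation-anchored" part of the loss is not reproduced because the abstract does not specify it.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mk_mmd2(x, y, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of squared MMD, averaged over several Gaussian kernels.

    x, y: arrays of shape (n, d) and (m, d), e.g. collaborative and textual
    embeddings projected to a shared dimension d (an assumption here).
    bandwidths: hypothetical kernel widths; equal kernel weights are assumed.
    """
    total = 0.0
    for bw in bandwidths:
        k_xx = gaussian_kernel(x, x, bw).mean()
        k_yy = gaussian_kernel(y, y, bw).mean()
        k_xy = gaussian_kernel(x, y, bw).mean()
        total += k_xx + k_yy - 2.0 * k_xy
    return total / len(bandwidths)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    collab = rng.normal(size=(64, 8))          # stand-in collaborative embeddings
    text_near = rng.normal(size=(64, 8))       # same distribution
    text_far = rng.normal(loc=3.0, size=(64, 8))  # shifted distribution
    print("same distribution:", mk_mmd2(collab, text_near))
    print("shifted distribution:", mk_mmd2(collab, text_far))
```

Used as a loss, a term like this shrinks as the two embedding distributions move closer. Averaging over several bandwidths avoids committing to a single length scale, which is the usual motivation for the multi-kernel variant.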
