IRAug 30, 2024
Bridging User Dynamics: Transforming Sequential Recommendations with Schrödinger Bridge and Diffusion ModelsWenjia Xie, Rui Zhou, Hao Wang et al.
Sequential recommendation has attracted increasing attention due to its ability to accurately capture the dynamic changes in user interests. We have noticed that generative models, especially diffusion models, which have achieved significant results in fields like image and audio, hold considerable promise in the field of sequential recommendation. However, existing sequential recommendation methods based on diffusion models are constrained by a prior distribution limited to Gaussian distribution, hindering the possibility of introducing user-specific information for each recommendation and leading to information loss. To address these issues, we introduce the Schrödinger Bridge into diffusion-based sequential recommendation models, creating the SdifRec model. This allows us to replace the Gaussian prior of the diffusion model with the user's current state, directly modeling the process from a user's current state to the target recommendation. Additionally, to better utilize collaborative information in recommendations, we propose an extended version of SdifRec called con-SdifRec, which utilizes user clustering information as a guiding condition to further enhance the posterior distribution. Finally, extensive experiments on multiple public benchmark datasets have demonstrated the effectiveness of SdifRec and con-SdifRec through comparison with several state-of-the-art methods. Further in-depth analysis has validated their efficiency and robustness.
LGJun 5, 2024Code
Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential RecommendationTingjia Shen, Hao Wang, Jiaqing Zhang et al.
Cross-Domain Sequential Recommendation (CDSR) aims to mine and transfer users' sequential preferences across different domains to alleviate the long-standing cold-start issue. Traditional CDSR models capture collaborative information through user and item modeling while overlooking valuable semantic information. Recently, Large Language Model (LLM) has demonstrated powerful semantic reasoning capabilities, motivating us to introduce them to better capture semantic information. However, introducing LLMs to CDSR is non-trivial due to two crucial issues: seamless information integration and domain-specific generation. To this end, we propose a novel framework named URLLM, which aims to improve the CDSR performance by exploring the User Retrieval approach and domain grounding on LLM simultaneously. Specifically, we first present a novel dual-graph sequential model to capture the diverse information, along with an alignment and contrastive learning method to facilitate domain knowledge transfer. Subsequently, a user retrieve-generation model is adopted to seamlessly integrate the structural information into LLM, fully harnessing its emergent inferencing ability. Furthermore, we propose a domain-specific strategy and a refinement module to prevent out-of-domain generation. Extensive experiments on Amazon demonstrated the information integration and domain-specific generation ability of URLLM in comparison to state-of-the-art baselines. Our code is available at https://github.com/TingJShen/URLLM
IRMay 31, 2023Code
A Survey on Large Language Models for RecommendationLikang Wu, Zhi Zheng, Zhaopeng Qiu et al.
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS). These models, trained on massive amounts of data using self-supervised learning, have demonstrated remarkable success in learning universal representations and have the potential to enhance various aspects of recommendation systems by some effective transfer techniques such as fine-tuning and prompt tuning, and so on. The crucial aspect of harnessing the power of language models in enhancing recommendation quality is the utilization of their high-quality representations of textual features and their extensive coverage of external knowledge to establish correlations between items and users. To provide a comprehensive understanding of the existing LLM-based recommendation systems, this survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec), with the latter being systematically sorted out for the first time. Furthermore, we systematically review and analyze existing LLM-based recommendation systems within each paradigm, providing insights into their methodologies, techniques, and performance. Additionally, we identify key challenges and several valuable findings to provide researchers and practitioners with inspiration. We have also created a GitHub repository to index relevant papers on LLMs for recommendation, https://github.com/WLiK/LLM4Rec.
AINov 30, 2024
Optimizing Sequential Recommendation Models with Scaling Laws and Approximate EntropyTingjia Shen, Hao Wang, Chuhan Wu et al.
Scaling Laws have emerged as a powerful framework for understanding how model performance evolves as they increase in size, providing valuable insights for optimizing computational resources. In the realm of Sequential Recommendation (SR), which is pivotal for predicting users' sequential preferences, these laws offer a lens through which to address the challenges posed by the scalability of SR models. However, the presence of structural and collaborative issues in recommender systems prevents the direct application of the Scaling Law (SL) in these systems. In response, we introduce the Performance Law for SR models, which aims to theoretically investigate and model the relationship between model performance and data quality. Specifically, we first fit the HR and NDCG metrics to transformer-based SR models. Subsequently, we propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics. Our method enables accurate predictions across various dataset scales and model sizes, demonstrating a strong correlation in large SR models and offering insights into achieving optimal performance for any given model configuration.