IR | AI | Aug 23, 2023

LLMRec: Benchmarking Large Language Models on Recommendation Task

Junling Liu, Chao Liu, Peilin Zhou, Qichen Ye, Dading Chong, Kang Zhou, Yueqi Xie, Yuwei Cao, Shoujin Wang, Chenyu You, Philip S. Yu

arXiv:2308.12241v1 | 21.960 citations | h-index: 35 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work gives researchers a benchmark of LLMs on recommendation tasks, but it is incremental: it applies existing methods to a new domain without introducing novel techniques.

The authors address the lack of thorough investigation of large language models (LLMs) in recommendation by proposing LLMRec, a benchmark system that evaluates off-the-shelf LLMs on five tasks. They find only moderate proficiency on accuracy-based tasks such as sequential and direct recommendation, but performance comparable to state-of-the-art methods on explainability-based tasks.

Recently, the fast development of Large Language Models (LLMs) such as ChatGPT has significantly advanced NLP tasks by enhancing the capabilities of conversational models. However, the application of LLMs in the recommendation domain has not been thoroughly investigated. To bridge this gap, we propose LLMRec, an LLM-based recommender system designed for benchmarking LLMs on various recommendation tasks. Specifically, we benchmark several popular off-the-shelf LLMs, such as ChatGPT, LLaMA, and ChatGLM, on five recommendation tasks: rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization. Furthermore, we investigate the effectiveness of supervised finetuning in improving LLMs' instruction compliance ability. The benchmark results indicate that LLMs display only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation. However, they demonstrate performance comparable to state-of-the-art methods in explainability-based tasks. We also conduct qualitative evaluations to further assess the quality of the content generated by different models, and the results show that LLMs can truly understand the provided information and generate clearer and more reasonable results. We hope that this benchmark will inspire researchers to delve deeper into the potential of LLMs for enhancing recommendation performance. Our code, processed data, and benchmark results are available at https://github.com/williamliujl/LLMRec.
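
To make the "accuracy-based" side of the benchmark concrete, the sketch below scores ranked recommendation lists against a single held-out item per user. The abstract does not name the metrics LLMRec reports, so Hit Rate@k and NDCG@k are assumed here as commonly used ranking metrics, and the function names and data layout are hypothetical rather than taken from the LLMRec repository.

```python
import math
from typing import Dict, List, Sequence


def hit_rate_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """Return 1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """NDCG@k with one relevant item: 1 / log2(rank + 1), or 0.0 on a miss."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0


def evaluate(
    predictions: Dict[str, List[str]],  # user id -> ranked item ids (assumed layout)
    targets: Dict[str, str],            # user id -> single held-out item id
    k: int = 10,
) -> Dict[str, float]:
    """Average both metrics over every user that has a held-out target."""
    users = list(targets)
    if not users:
        raise ValueError("targets must contain at least one user")
    hr = sum(hit_rate_at_k(predictions.get(u, []), targets[u], k) for u in users)
    ndcg = sum(ndcg_at_k(predictions.get(u, []), targets[u], k) for u in users)
    return {"hr": hr / len(users), "ndcg": ndcg / len(users)}


# Toy data for illustration only; these are not results from the paper.
toy_predictions = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
toy_targets = {"u1": "b", "u2": "q"}
toy_scores = evaluate(toy_predictions, toy_targets, k=3)
```

In this toy case one of the two users gets a hit, at rank 2, so the averaged scores reflect a single partial success. An LLM's free-text output would first need to be parsed into a ranked list of item identifiers before it could be scored this way.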
