Advancements in Recommender Systems: A Comprehensive Analysis Based on Data, Algorithms, and Evaluation
This is an incremental review paper that synthesizes existing research to identify challenges and propose future directions for recommender systems researchers and practitioners.
This paper systematically reviews 286 research papers to identify current challenges in recommender systems across data, algorithms, and evaluation. It finds that issues such as cold start, data sparsity, and offline data leakage have prominent impacts, and it proposes solutions such as fusing physiological signals and fine-tuning pre-trained large models.
Using 286 research papers collected from the Web of Science, ScienceDirect, SpringerLink, arXiv, and Google Scholar databases, this study adopts a systematic review methodology to summarize the current challenges and potential future developments in the data, algorithm, and evaluation aspects of recommender systems (RSs). The review finds that RS research covers five major topics: algorithmic improvement, domain applications, user behavior & cognition, data processing & modeling, and social impact & ethics. Collaborative filtering and hybrid recommendation techniques are mainstream. The performance of RSs is jointly limited by eight data issues in four categories, twelve algorithmic issues in two categories, and two evaluation issues. Notably, data issues such as cold start, data sparsity, and data poisoning; algorithmic issues such as interest drift, device-cloud collaboration, non-causal modeling, and multi-task conflicts; and evaluation issues such as offline data leakage and multi-objective balancing have prominent impacts. Feasible solutions to these prominent challenges include fusing physiological signals for multimodal modeling, using user information behavior to defend against data poisoning, evaluating generative recommendations via social experiments, fine-tuning pre-trained large models to schedule device-cloud resources, enhancing causal inference with deep reinforcement learning, training multi-task models based on probability distributions, using cross-temporal dataset partitioning, and evaluating recommendation objectives across the full lifecycle. Together, these approaches can help unlock the power and value of RSs. The collected literature is drawn mainly from major international databases, and future research will expand upon it.
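As an illustration of the cross-temporal dataset partitioning mentioned above as a remedy for offline data leakage, the following is a minimal sketch, not the method of any reviewed paper. It assumes interactions are stored as (user, item, timestamp) tuples and that a single global cutoff time is chosen; the function name `temporal_split` and the toy data are hypothetical.

```python
from typing import List, Tuple

# Assumed record layout: (user_id, item_id, unix_timestamp)
Interaction = Tuple[str, str, int]


def temporal_split(
    interactions: List[Interaction], cutoff: int
) -> Tuple[List[Interaction], List[Interaction]]:
    """Split on one global cutoff so that no training record is later
    than any test record (unlike a random split, which mixes time)."""
    train = [r for r in interactions if r[2] < cutoff]
    test = [r for r in interactions if r[2] >= cutoff]
    return train, test


# Toy data, invented for illustration only
interactions = [
    ("u1", "i1", 100),
    ("u1", "i2", 250),
    ("u2", "i1", 180),
    ("u2", "i3", 300),
    ("u3", "i2", 90),
]

train, test = temporal_split(interactions, cutoff=200)
```

A random or per-user leave-one-out split can place interactions in the training set that occurred after some test interactions, which lets the model see the future; a global time cutoff avoids that at the cost of leaving some users without test data.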