IR, AI, Feb 8, 2023

Multimodal Recommender Systems: A Survey

arXiv:2302.03883v2, 159 citations, h-index: 18, Has Code
Originality: Synthesis-oriented
AI Analysis

The survey addresses the need to understand multimodal content in recommendation, both to improve accuracy and to alleviate data sparsity. It is aimed primarily at researchers and practitioners in academia and industry. As a survey, its contribution is incremental.

This paper provides a comprehensive survey of Multimodal Recommender Systems (MRS), summarizing general procedures, major challenges, existing models categorized by technical aspects, and resources like datasets and code.

Recommender systems (RS) have become an integral part of online services. They use various deep learning techniques to model user preference based on identifier and attribute information. With the emergence of multimedia services such as short videos and news, understanding this content while recommending has become critical. Multimodal features also help alleviate the data sparsity problem in RS. As a result, the Multimodal Recommender System (MRS) has recently attracted much attention from both academia and industry. In this paper, we give a comprehensive survey of MRS models, mainly from a technical perspective. First, we summarize the general procedures and major challenges for MRS. Then, we introduce the existing MRS models according to four categories: Modality Encoder, Feature Interaction, Feature Enhancement, and Model Optimization. For the convenience of those who want to research this field, we also summarize the dataset and code resources. Finally, we discuss some promising future directions of MRS and conclude the paper. To give access to more details of the surveyed papers, such as implementation code, we open-source a repository.

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
