IR, AI · Feb 21, 2025

Bridging Domain Gaps between Pretrained Multimodal Models and Recommendations

Wenyu Zhang, Jie Luo, Xinming Zhang, Yuan Fang

arXiv:2502.15542v1 · h-index: 9

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of efficiently adapting pre-trained models for personalized recommendation. The advance is incremental because it builds on existing parameter-efficient methods.

The paper tackles the performance degradation when applying pre-trained multimodal models to recommendation tasks due to domain gaps, proposing PTMRec, a parameter-efficient tuning framework that achieves competitive results without costly pre-training.

With the explosive growth of multimodal content online, pre-trained visual-language models have shown great potential for multimodal recommendation. These models achieve decent performance when applied in a frozen manner. Surprisingly, however, adopting a joint training approach instead leads to performance worse than the baseline, owing to significant domain gaps between pre-training and personalized recommendation (e.g., feature distribution discrepancy and task objective misalignment). Existing approaches either rely on simple feature extraction or require computationally expensive full-model fine-tuning, and so struggle to balance effectiveness and efficiency. To tackle these challenges, we propose Parameter-efficient Tuning for Multimodal Recommendation (PTMRec), a novel framework that bridges the domain gap between pre-trained models and recommendation systems through a knowledge-guided dual-stage parameter-efficient training strategy. This framework not only eliminates the need for costly additional pre-training but also flexibly accommodates various parameter-efficient tuning methods.
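To make the contrast between frozen use and parameter-efficient tuning concrete, here is a minimal sketch of one common parameter-efficient method, a LoRA-style low-rank adapter on a frozen linear layer. It is not PTMRec's method: the abstract does not specify the dual-stage strategy, and the class name `LowRankAdaptedLinear`, the dimensions, and the random stand-in weights are all assumptions made for illustration.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LowRankAdaptedLinear:
    """Frozen weight W plus a trainable low-rank update B @ A (LoRA-style).

    Hypothetical illustration only; not taken from the PTMRec paper.
    """

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        # Stand-in for a pre-trained weight; it is never updated.
        self.W = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors. B starts at zero, so the adapted layer
        # initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        frozen = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [f + u for f, u in zip(frozen, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


layer = LowRankAdaptedLinear(d_in=64, d_out=32, rank=4)
x = [1.0] * 64
print(layer.trainable_params(), "trainable vs", layer.frozen_params(), "frozen")
```

With these assumed sizes, only the two small factors would be updated during tuning while the pre-trained weight stays fixed, which is the general efficiency argument behind parameter-efficient methods.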
