IR AI LG

Oct 11, 2024

Federated Vision-Language-Recommendation with Personalized Fusion

Zhiwei Li, Guodong Long, Jing Jiang, Chengqi Zhang, Qiang Yang

arXiv:2410.08478v48.13 citations

h-index: 36

Originality: Incremental advance

AI Analysis

This work addresses the need for personalized, privacy-preserving recommendation in on-device applications. It is an incremental advance in federated learning for Vision-Language-Recommendation (VLR).

The paper applies vision-language models to recommendation in a federated learning setting to improve user privacy and personalization. It introduces FedVLR, whose bi-level fusion mechanism is validated on seven benchmark datasets.

Applying large pre-trained Vision-Language Models to recommendation is a burgeoning field, a direction we term Vision-Language-Recommendation (VLR). Bringing VLR to user-oriented on-device intelligence within a federated learning framework is a crucial step for enhancing user privacy and delivering personalized experiences. This paper introduces FedVLR, a federated VLR framework specially designed for user-specific personalized fusion of vision-language representations. At its core is a novel bi-level fusion mechanism: the server-side multi-view fusion module first generates a diverse set of pre-fused multimodal views. Subsequently, each client employs a user-specific mixture-of-experts mechanism to adaptively integrate these views based on the individual user's interaction history. This lightweight personalized fusion module provides an efficient way to implement a federated VLR system. The effectiveness of our proposed FedVLR has been validated on seven benchmark datasets.
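
The bi-level fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The four fusion operators in `server_views`, the dot-product gate in `client_gate`, and all function names are assumptions chosen only to make the idea concrete: the server builds several pre-fused views of each item, and each client computes its own softmax weights over those views from its interaction history.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def server_views(visual, text):
    """Server side: build several pre-fused multimodal views of one item.

    The choice of views (visual only, text only, mean, element-wise
    product) is a hypothetical stand-in for the paper's fusion module.
    """
    return [
        list(visual),
        list(text),
        [(v + t) / 2 for v, t in zip(visual, text)],
        [v * t for v, t in zip(visual, text)],
    ]


def client_gate(user_vec, history_views):
    """Client side: user-specific gating weights over the views.

    history_views holds one list of views per item the user interacted
    with. The gate (mean user-view affinity followed by softmax) is an
    assumed, simplified mixture-of-experts router.
    """
    num_views = len(history_views[0])
    logits = []
    for k in range(num_views):
        affinity = sum(dot(user_vec, item[k]) for item in history_views)
        logits.append(affinity / len(history_views))
    return softmax(logits)


def fuse(weights, views):
    """Weighted sum of the views, one weight per view."""
    dim = len(views[0])
    return [sum(w * view[d] for w, view in zip(weights, views)) for d in range(dim)]


def score(user_vec, weights, views):
    """Preference score of a user for an item under the user's own fusion."""
    return dot(user_vec, fuse(weights, views))


# Toy usage: one history item, one candidate item, 2-dimensional embeddings.
history = [server_views([1.0, 0.0], [0.0, 1.0])]
user = [1.0, 0.0]
weights = client_gate(user, history)
candidate = server_views([0.8, 0.2], [0.1, 0.9])
print(weights, score(user, weights, candidate))
```

In this sketch only the gating weights depend on the user, so the per-client state is small, which matches the abstract's description of the personalized fusion module as lightweight. How the actual framework parameterizes and trains the gate is not stated in the text above.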
