IR · AI · Oct 14, 2025

Causal Inspired Multi Modal Recommendation

arXiv:2510.12325v1
h-index: 1
Originality: Highly original
AI Analysis

This work addresses critical biases in e-commerce and online advertising recommendations, offering an incremental improvement with strong specific gains.

The paper tackles biases in multimodal recommender systems, such as modal confounding and interaction bias, by proposing a causal-inspired framework that uses cross-modal diffusion and causal adjustments, achieving significant performance improvements over state-of-the-art baselines on three e-commerce datasets.

Multimodal recommender systems enhance personalized recommendations in e-commerce and online advertising by integrating visual, textual, and user-item interaction data. However, existing methods often overlook two critical biases: (i) modal confounding, where latent factors (e.g., brand style or product category) simultaneously drive multiple modalities and influence user preference, leading to spurious feature-preference associations; (ii) interaction bias, where genuine user preferences are mixed with noise from exposure effects and accidental clicks. To address these challenges, we propose a Causal-inspired multimodal Recommendation framework. Specifically, we introduce a dual-channel cross-modal diffusion module to identify hidden modal confounders, utilize back-door adjustment with hierarchical matching and vector-quantized codebooks to block confounding paths, and apply front-door adjustment combined with causal topology reconstruction to build a deconfounded causal subgraph. Extensive experiments on three real-world e-commerce datasets demonstrate that our method significantly outperforms state-of-the-art baselines while maintaining strong interpretability.
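
The back-door adjustment mentioned in the abstract can be illustrated on toy data. The sketch below is not the paper's implementation: it assumes the hidden confounder has already been discretized into a small set of indices (loosely analogous to a vector-quantized codebook entry) and applies the textbook formula P(Y | do(X=x)) = Σ_z P(Y | X=x, Z=z) P(Z=z). The function `backdoor_adjust`, the helper `make_cells`, and the click counts are all hypothetical, chosen only to make the confounding visible.

```python
import numpy as np


def backdoor_adjust(x, y, z, x_value):
    """Estimate P(Y=1 | do(X=x_value)) by stratifying on a discrete confounder z.

    Textbook back-door formula: sum_z P(Y=1 | X=x_value, Z=z) * P(Z=z).
    """
    x, y, z = map(np.asarray, (x, y, z))
    estimate = 0.0
    for z_value in np.unique(z):
        in_stratum = z == z_value
        treated = in_stratum & (x == x_value)
        if not treated.any():
            # Positivity is violated: this stratum never sees X = x_value.
            raise ValueError(f"no samples with x={x_value} in stratum z={z_value}")
        estimate += y[treated].mean() * in_stratum.mean()
    return float(estimate)


def make_cells(cells):
    """Expand (z, x, n_total, n_positive) count cells into per-sample arrays."""
    rows = []
    for z_value, x_value, n_total, n_positive in cells:
        rows += [(z_value, x_value, 1)] * n_positive
        rows += [(z_value, x_value, 0)] * (n_total - n_positive)
    z, x, y = np.array(rows).T
    return x, y, z


# Hypothetical toy data (illustrative only, not from the paper):
#   z = confounder index (e.g. a brand-style cluster),
#   x = a binary item feature (e.g. a bright product image),
#   y = click (1) or no click (0).
# The confounder raises both the chance of x=1 and the chance of a click.
x, y, z = make_cells([
    (0, 1, 20, 10),  # z=0, x=1: 10 clicks out of 20
    (0, 0, 80, 20),  # z=0, x=0: 20 clicks out of 80
    (1, 1, 80, 60),  # z=1, x=1: 60 clicks out of 80
    (1, 0, 20, 10),  # z=1, x=0: 10 clicks out of 20
])

# Naive association: mixes the effect of x with the effect of z.
naive_gap = y[x == 1].mean() - y[x == 0].mean()

# Back-door adjusted effect: averages within-stratum click rates over P(z).
adjusted_gap = backdoor_adjust(x, y, z, 1) - backdoor_adjust(x, y, z, 0)

print(f"naive gap:    {naive_gap:.2f}")
print(f"adjusted gap: {adjusted_gap:.2f}")
```

On this toy data the naive gap is 0.40 while the adjusted gap is 0.25, which matches the within-stratum effect of the feature. The difference is the spurious feature-preference association that the abstract attributes to modal confounding.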
