IR · Mar 5, 2021

Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation

arXiv:2103.03578v1 · 177 citations
Originality: Incremental advance
AI Analysis

This paper addresses a bottleneck in sequential recommendation: improving accuracy by making better use of side information. The advance is incremental, since it builds on existing BERT-based methods.

The paper tackles the problem of effectively incorporating side information (e.g., item category or tag) into sequential recommendation systems under the BERT framework, where naive fusion methods often fail. The authors report that the proposed NOVA mechanism consistently outperforms state-of-the-art models on public and commercial datasets with negligible computational overhead.

Sequential recommender systems aim to model users' evolving interests from their historical behaviors and hence make customized, time-relevant recommendations. Compared with traditional models, deep learning approaches such as CNNs and RNNs have achieved remarkable advances in recommendation tasks. Recently, the BERT framework has also emerged as a promising method, benefiting from its self-attention mechanism for processing sequential data. However, one limitation of the original BERT framework is that it considers only one input source, the natural language tokens. How to leverage various types of information under the BERT framework remains an open question. Nonetheless, it is intuitively appealing to use side information, such as item category or tag, for more comprehensive depictions and better recommendations. In our pilot experiments, we found that naive approaches, which directly fuse side information into the item embeddings, usually bring very little benefit or even negative effects. Therefore, in this paper, we propose the NOninVasive self-attention mechanism (NOVA) to leverage side information effectively under the BERT framework. NOVA uses side information to generate a better attention distribution rather than directly altering the item embeddings, which may cause the side information to overwhelm the item representation. We validate the NOVA-BERT model on both public and commercial datasets, and our method consistently outperforms state-of-the-art models with negligible computational overhead.
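The contrast the abstract draws, side information shaping the attention distribution versus side information altering the item embeddings, can be illustrated with a minimal single-head sketch. This is not the authors' implementation: the additive fusion, the function names, the random weights, and the toy dimensions are all assumptions made for illustration, and positional embeddings, multiple heads, and the rest of the BERT stack are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def invasive_attention(item_emb, side_emb, Wq, Wk, Wv):
    # "Naive" fusion (assumed additive here): side information is mixed
    # into the representation that feeds queries, keys, AND values.
    fused = item_emb + side_emb
    q, k, v = fused @ Wq, fused @ Wk, fused @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def noninvasive_attention(item_emb, side_emb, Wq, Wk, Wv):
    # Non-invasive variant: side information only influences the
    # attention weights (queries and keys); values come from the
    # unmodified item embeddings.
    fused = item_emb + side_emb
    q, k = fused @ Wq, fused @ Wk
    v = item_emb @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores)
    return weights @ v, weights

# Toy inputs (hypothetical sizes): a sequence of 5 items, dimension 8.
rng = np.random.default_rng(0)
seq_len, dim = 5, 8
item_emb = rng.normal(size=(seq_len, dim))
side_emb = rng.normal(size=(seq_len, dim))  # e.g. embedded category or tag
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))

out, weights = noninvasive_attention(item_emb, side_emb, Wq, Wk, Wv)
```

In this sketch every output row is a weighted average of projected item embeddings only, so the side information can change which items are attended to without being written into the item representation itself.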

