IR, CL, LG, ML (Apr 11, 2018)

Word2Vec applied to Recommendation: Hyperparameters Matter

arXiv:1804.04212v3, 168 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a practical issue for recommender-system developers: reusing hyperparameters tuned for NLP leads to significant performance losses. The contribution is incremental, since it concerns hyperparameter optimization rather than a new method.

The paper tackles suboptimal hyperparameter tuning when Word2Vec is used for recommendation. It finds that optimizing neglected hyperparameters, such as the negative sampling distribution and the window size, can improve performance by an order of magnitude.

Skip-gram with negative sampling, a popular variant of Word2vec originally designed and tuned to create word embeddings for Natural Language Processing, has been used to create item embeddings with successful applications in recommendation. Although the two fields neither share the same type of data nor evaluate on the same tasks, recommendation applications tend to reuse the same already-tuned hyperparameter values, even though optimal hyperparameter values are known to be data and task dependent. We thus investigate the marginal importance of each hyperparameter in a recommendation setting through large hyperparameter grid searches on various datasets. Results reveal that optimizing neglected hyperparameters, namely the negative sampling distribution, number of epochs, subsampling parameter and window size, significantly improves performance on a recommendation task, and can increase it by an order of magnitude. Importantly, we find that optimal hyperparameter configurations for Natural Language Processing tasks and Recommendation tasks are noticeably different.
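
To make the four hyperparameters named in the abstract concrete, here is a minimal sketch in plain Python. It assumes the usual parameterization of the negative sampling distribution, in which an item is drawn with probability proportional to its count raised to an exponent (0.75 is the value popularized by the original word2vec work). The toy sessions, the grid values, and the key names `window`, `epochs`, `sample` and `ns_exponent` are illustrative assumptions, not the paper's actual search space or code.

```python
from collections import Counter
from itertools import product


def negative_sampling_distribution(item_counts, alpha):
    """Return P(item) proportional to count(item) ** alpha.

    alpha = 1 samples by raw popularity, alpha = 0 is uniform,
    and alpha < 0 favors rare items.
    """
    weights = {item: count ** alpha for item, count in item_counts.items()}
    total = sum(weights.values())
    return {item: weight / total for item, weight in weights.items()}


# Hypothetical toy data: each session is a sequence of item ids,
# playing the role that a sentence plays in NLP.
sessions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "d"],
    ["a"],
]
item_counts = Counter(item for session in sessions for item in session)

# Illustrative grid over the four hyperparameters; values are placeholders.
grid = {
    "window": [3, 7, 15],
    "epochs": [10, 50],
    "sample": [1e-5, 1e-3],
    "ns_exponent": [-0.5, 0.0, 0.75, 1.0],
}

# Enumerate every combination, one dict per configuration.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

for alpha in grid["ns_exponent"]:
    dist = negative_sampling_distribution(item_counts, alpha)
    print(alpha, {item: round(p, 3) for item, p in sorted(dist.items())})
```

In a real search, each entry of `configs` would be used to train one skip-gram model and score it on a held-out recommendation metric. The size of the grid grows multiplicatively with every added value, which is why such searches are expensive.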

Code Implementations: 1 repo
