A Missing Information Loss function for implicit feedback datasets
The paper addresses popularity bias, a common problem for recommender systems with large product catalogues, though it is an incremental improvement over existing methods.
Latent factor models for recommender systems with implicit feedback usually treat missing user-item interactions as negative feedback, which leads to zero-preference recommendations for most items. The proposed Missing Information Loss (MIL) function explicitly forbids treating unobserved interactions as either positive or negative feedback. It achieves competitive ranking performance while recommending popular items up to 20% less often than the best competing method and increasing long-tail recommendations by up to 50%.
Latent factor models for Recommender Systems with implicit feedback typically treat unobserved user-item interactions (i.e. missing information) as negative feedback. This is frequently done either through negative sampling (point-wise loss) or with a ranking loss function (pair- or list-wise estimation). Since a zero-preference recommendation is a valid solution for most common objective functions, regarding unknown values as actual zeros results in users having a zero-preference recommendation for most of the available items. In this paper we propose a novel objective function, the \emph{Missing Information Loss} (MIL), that explicitly forbids treating unobserved user-item interactions as positive or negative feedback. We apply this loss to both traditional Matrix Factorization and a user-based Denoising Autoencoder, and compare it with other established objective functions such as cross-entropy (both point- and pair-wise) and the recently proposed multinomial log-likelihood. MIL achieves competitive performance in ranking-aware metrics when applied to three datasets. Furthermore, we show that this relevance is obtained while displaying popular items less frequently (up to a $20\%$ decrease with respect to the best competing method). This debiasing away from popular items favours the appearance of infrequent items (up to a $50\%$ increase in long-tail recommendations), a valuable feature for Recommender Systems with a large catalogue of products.
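To make the criticised baseline concrete, the following is a minimal sketch of point-wise cross-entropy with negative sampling for a matrix factorization model. It is not the MIL objective, whose form the abstract does not give. The function names, the binary interaction matrix, and the factor matrices `P` and `Q` are assumptions introduced only for illustration.

```python
import numpy as np


def sample_pointwise_batch(interactions, n_neg_per_pos, rng):
    """Build (user, item, label) triples from a binary user-item matrix.

    Observed pairs get label 1. For each observed pair, `n_neg_per_pos`
    unobserved items of the same user are drawn and given label 0, i.e.
    missing information is treated as negative feedback.
    """
    users, items = np.nonzero(interactions)
    triples = [(int(u), int(i), 1.0) for u, i in zip(users, items)]
    for u in users:
        unobserved = np.flatnonzero(interactions[u] == 0)
        if unobserved.size == 0:
            continue  # this user has interacted with every item
        negatives = rng.choice(unobserved, size=n_neg_per_pos, replace=True)
        triples.extend((int(u), int(j), 0.0) for j in negatives)
    return triples


def pointwise_cross_entropy(P, Q, triples):
    """Mean binary cross-entropy of sigmoid(P[u] . Q[i]) against the labels."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for u, i, y in triples:
        score = 1.0 / (1.0 + np.exp(-(P[u] @ Q[i])))
        total -= y * np.log(score + eps) + (1.0 - y) * np.log(1.0 - score + eps)
    return total / len(triples)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical toy data: 3 users, 4 items, 1 = observed interaction.
    R = np.array([[1, 0, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 1]])
    batch = sample_pointwise_batch(R, n_neg_per_pos=2, rng=rng)
    P = rng.normal(scale=0.1, size=(3, 2))  # user factors
    Q = rng.normal(scale=0.1, size=(4, 2))  # item factors
    print(pointwise_cross_entropy(P, Q, batch))
```

Every sampled negative pushes the predicted score of an unobserved pair towards zero, even though the user may simply never have seen that item. This is the behaviour that, according to the abstract, MIL is designed to avoid.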