LiRa: A New Likelihood-Based Similarity Score for Collaborative Filtering
This work addresses a key challenge in collaborative filtering for recommender systems: measuring user similarity when most ratings are missing. It offers potential improvements in accuracy and scalability, though the contribution appears incremental because it builds on existing similarity evaluation methods.
The paper tackles the problem of evaluating similarity between users in recommender systems with high missing data rates. It introduces LiRa, a likelihood-based similarity score, and shows it to be more effective than traditional scores such as the Pearson correlation coefficient at identifying similar users.
Recommender system data presents unique challenges to the data mining, machine learning, and algorithms communities. The high missing data rate, combined with the large scale and high dimensionality typical of recommender system data, requires new tools and methods for efficient data analysis. Here, we address the challenge of evaluating similarity between two users in a recommender system, where only a small set of ratings is available for each user. We present a new similarity score, which we call LiRa, based on a statistical model of user similarity for large-scale, discrete-valued data with many missing values. We show that this score, based on a ratio of likelihoods, is more effective at identifying similar users than traditional similarity scores used in user-based collaborative filtering, such as the Pearson correlation coefficient. We argue that our approach has significant potential to improve both accuracy and scalability in collaborative filtering.
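
To make the contrast between a correlation-based score and a likelihood-ratio score concrete, the following is a minimal sketch and not the LiRa model itself, which the abstract does not specify. It assumes ratings on a 1 to 5 scale, a hypothetical distribution `p_same` for rating differences between similar users, and uniformly random ratings under the "independent users" hypothesis. The function names `pearson` and `log_likelihood_ratio` are illustrative only.

```python
import math


def pearson(u, v):
    """Pearson correlation over the items rated by both users.

    u and v are dicts mapping item id -> rating.
    """
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return 0.0
    xs = [u[i] for i in common]
    ys = [v[i] for i in common]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def log_likelihood_ratio(u, v, levels=5, p_same=(0.5, 0.3, 0.15, 0.04, 0.01)):
    """log10 of P(differences | similar) / P(differences | independent).

    ASSUMPTIONS (not from the paper): p_same[k] is a made-up probability
    that two similar users differ by k on an item; independent users are
    modeled as rating uniformly at random on 1..levels.
    """
    # Distribution of |a - b| for two independent uniform ratings.
    p_indep = [
        (levels if k == 0 else 2 * (levels - k)) / levels ** 2
        for k in range(levels)
    ]
    score = 0.0
    for item in set(u) & set(v):
        d = abs(u[item] - v[item])
        score += math.log10(p_same[d] / p_indep[d])
    return score


# Two users who agree on only 2 items, and two who agree on 10 items.
few_a = {"i1": 1, "i2": 5}
few_b = {"i1": 1, "i2": 5}
many_a = {f"i{k}": 1 + k % 5 for k in range(10)}
many_b = dict(many_a)

print(pearson(few_a, few_b), pearson(many_a, many_b))
print(log_likelihood_ratio(few_a, few_b), log_likelihood_ratio(many_a, many_b))
```

In this sketch both pairs of users receive the same Pearson correlation, even though one pair shares only two ratings. The likelihood-ratio score instead sums evidence per co-rated item, so the pair with ten agreeing ratings scores higher. That sensitivity to the amount of overlapping data is one plausible reason a likelihood-based score could behave better when ratings are sparse.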