IR LG Mar 17, 2023

Explaining the Performance of Collaborative Filtering Methods With Optimal Data Characteristics

arXiv:2303.11172v1 h-index: 17

Originality: Incremental advance

AI Analysis

This work provides a simplified explanation for performance variations in recommender systems, which is incremental as it builds on prior studies by reducing the number of key characteristics from six to two.

The study found that most of the performance variation of collaborative filtering methods can be explained by just two rating data characteristics, information per user and information per item, rather than six or more. For square user-item matrices, performance is quadratic in these characteristics. The results are based on experiments with seven CF methods and three public datasets.

The performance of a Collaborative Filtering (CF) method depends on the properties of the User-Item Rating Matrix (URM), and these properties, or Rating Data Characteristics (RDC), change constantly. Recent studies used six or more RDC to explain a significant share of the variation in CF performance that results from changes in the URM. Here, we find that a significant proportion of the variation in the performance of different CF techniques can be attributed to only two RDC: the number of ratings per user, or Information per User (IpU), and the number of ratings per item, or Information per Item (IpI). Moreover, the performance of CF algorithms is quadratic in IpU (or IpI) for a square URM. The findings of this study are based on seven well-established CF methods and three popular public recommender datasets: 1M MovieLens, 25M MovieLens, and Yahoo! Music Rating datasets.
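To make the two characteristics concrete, here is a minimal Python sketch that computes IpU and IpI from a list of observed ratings. It assumes that "number of ratings per user" means the total rating count divided by the number of distinct users, and likewise for items. The abstract does not spell out the exact formula, so treat this as one plausible reading, not the authors' definition. The function name and the toy data are hypothetical.

```python
def rating_data_characteristics(ratings):
    """Return (IpU, IpI) for an iterable of (user_id, item_id, rating) triples.

    Assumption: IpU = total ratings / distinct users and
    IpI = total ratings / distinct items. Users or items with no
    observed rating do not appear in the triples and are not counted.
    """
    users = set()
    items = set()
    n_ratings = 0
    for user_id, item_id, _rating in ratings:
        users.add(user_id)
        items.add(item_id)
        n_ratings += 1
    if n_ratings == 0:
        raise ValueError("at least one rating is required")
    return n_ratings / len(users), n_ratings / len(items)


# Hypothetical toy URM with 4 users, 4 items, and 7 observed ratings.
toy_ratings = [
    ("u1", "i1", 5), ("u1", "i3", 3),
    ("u2", "i2", 4),
    ("u3", "i1", 1), ("u3", "i4", 2),
    ("u4", "i3", 5), ("u4", "i4", 4),
]

ipu, ipi = rating_data_characteristics(toy_ratings)
print(ipu, ipi)
```

Under this reading, a square URM (as many users as items) always gives IpU equal to IpI, which is consistent with the abstract treating the two interchangeably in that case.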
