Calibrating the Predictions for Top-N Recommendations
The work addresses the need for accurate user preference predictions in recommender systems, but its contribution is incremental: it builds on existing calibration methods and narrows their focus to the top-N items.
The paper tackled miscalibrated predictions for the top-N recommended items in recommender systems. It showed that previous calibration methods miscalibrate these items despite good calibration over all items, and it proposed a rank-grouped optimization method that improved calibration metrics across diverse datasets and models.
Well-calibrated predictions of user preferences are essential for many applications. Since recommender systems typically select the top-N items for users, calibration for those top-N items, rather than for all items, is important. We show that previous calibration methods result in miscalibrated predictions for the top-N items, despite their excellent calibration performance when evaluated on all items. In this work, we address the miscalibration in the top-N recommended items. We first define evaluation metrics for this objective and then propose a generic method to optimize calibration models focusing on the top-N items. It groups the top-N items by their ranks and optimizes distinct calibration models for each group with rank-dependent training weights. We verify the effectiveness of the proposed method for both explicit and implicit feedback datasets, using diverse classes of recommender models.
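The rank-grouped idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Platt scaling as the calibration model, inclusive rank ranges as groups, and an exponentially decaying weight on items outside a group's rank range. The function names (`fit_platt`, `fit_rank_grouped`, `predict_rank_grouped`) and the `decay` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_platt(scores, labels, weights, lr=0.5, steps=2000):
    """Weighted Platt scaling p = sigmoid(a * s + b), fitted by gradient descent."""
    a, b = 1.0, 0.0
    w = weights / weights.sum()  # normalize so the step size is scale-free
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        grad = w * (p - labels)  # gradient of the weighted log loss w.r.t. the logit
        a -= lr * np.sum(grad * scores)
        b -= lr * np.sum(grad)
    return a, b


def fit_rank_grouped(scores, labels, ranks, groups, decay=0.5):
    """Fit one calibrator per rank group, e.g. groups=[(1, 3), (4, 10)].

    Assumed weighting scheme (not taken from the paper): items inside the
    group's rank range get weight 1, and the weight decays exponentially
    with the distance from that range.
    """
    models = {}
    for lo, hi in groups:
        dist = np.maximum(0, np.maximum(lo - ranks, ranks - hi))
        weights = np.exp(-decay * dist)
        models[(lo, hi)] = fit_platt(scores, labels, weights)
    return models


def predict_rank_grouped(models, scores, ranks):
    """Calibrate each score with the model of the group that contains its rank."""
    out = np.full(scores.shape, np.nan)  # ranks outside every group stay NaN
    for (lo, hi), (a, b) in models.items():
        mask = (ranks >= lo) & (ranks <= hi)
        out[mask] = 1.0 / (1.0 + np.exp(-(a * scores[mask] + b)))
    return out
```

In use, `scores`, `labels`, and `ranks` would be equal-length arrays holding, for each recommended item, the raw model score, the observed feedback (0 or 1), and the item's position in the user's top-N list. Fitting each group on all items with soft weights, rather than only on the items inside the group, is one way to keep small groups from overfitting. Other weighting schemes are equally plausible.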