MLPlatt: Simple Calibration Framework for Ranking Models
This work addresses calibration issues in ranking models for e-commerce platforms, improving interpretability and usability in downstream tasks, but the contribution is incremental because it builds on existing post-hoc calibration approaches.
The paper tackles the poor interpretability and lack of scale calibration of ranking models used in e-commerce. It introduces MLPlatt, a post-hoc calibration method that converts ranker outputs to click-through rate (CTR) probabilities while preserving item ordering, and reports an improvement of over 10% in F-ECE compared to other methods on two datasets.
Ranking models are extensively used in e-commerce for relevance estimation. These models often suffer from poor interpretability and a lack of scale calibration, particularly when trained with typical ranking loss functions. This paper addresses the problem of post-hoc calibration of ranking models. We introduce MLPlatt: a simple yet effective ranking model calibration method that preserves the item ordering and converts ranker outputs to interpretable click-through rate (CTR) probabilities usable in downstream tasks. The method is context-aware by design and achieves good calibration metrics both globally and within strata corresponding to different values of a selected categorical field (such as user country or device), which is often important from the business perspective of an e-commerce platform. We demonstrate the superiority of MLPlatt over existing approaches on two datasets, achieving an improvement of over 10% in F-ECE (Field Expected Calibration Error) compared to other methods. Most importantly, we show that high-quality calibration can be achieved without compromising the ranking quality.
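To make the two central ideas concrete (an order-preserving mapping from ranker scores to probabilities, and calibration error measured per field value), here is a minimal Python sketch. It is not the MLPlatt method, whose architecture the abstract does not describe. It uses classic Platt scaling, a sigmoid of an affine function of the score, as a stand-in, with hypothetical parameters `a` and `b`. The `field_ece` function follows one common field-level ECE definition (the absolute sum of click-minus-prediction residuals within each field value, summed over values and divided by the number of samples). The paper's exact F-ECE formula may differ, so treat that definition as an assumption.

```python
import numpy as np


def platt_scale(scores, a, b):
    """Classic Platt scaling (stand-in, not MLPlatt): sigmoid(a * score + b).

    The mapping is strictly increasing in the score whenever a > 0,
    so the item ordering produced by the ranker is preserved.
    """
    scores = np.asarray(scores, dtype=float)
    return 1.0 / (1.0 + np.exp(-(a * scores + b)))


def field_ece(probs, clicks, field):
    """Field-level ECE under an assumed definition:

    (1 / N) * sum over field values z of |sum_{i: field_i = z} (click_i - prob_i)|
    """
    probs = np.asarray(probs, dtype=float)
    clicks = np.asarray(clicks, dtype=float)
    field = np.asarray(field)
    total = 0.0
    for value in np.unique(field):
        mask = field == value
        # Residuals with opposite signs cancel within a field value,
        # so this measures bias per stratum rather than per-item error.
        total += abs(np.sum(clicks[mask] - probs[mask]))
    return total / len(probs)


# Hypothetical toy data: raw ranker scores, observed clicks, and a device field.
scores = np.array([2.0, -1.0, 0.5, 3.0])
clicks = np.array([1, 0, 0, 1])
device = np.array(["mobile", "mobile", "desktop", "desktop"])

probs = platt_scale(scores, a=1.5, b=-1.0)  # a and b are illustrative, not fitted
error = field_ece(probs, clicks, device)
```

In practice `a` and `b` would be fitted on held-out click data by minimizing log loss. Keeping `a` positive is what guarantees the calibrated probabilities rank items identically to the raw scores.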