Valid Explanations for Learning to Rank Models
This work addresses the need for transparency in ranking systems used in domains such as information retrieval. The contribution is incremental: it builds on existing explanation methods and adds new notions of validity.
The paper tackles the problem of interpreting the decisions of learning-to-rank (LTR) models. It proposes a model-agnostic local explanation method that identifies a small subset of input features as the explanation for a ranking decision, and it reports that this approach outperforms other model-agnostic explanation methods in validity, without compromising completeness, across pointwise, pairwise and listwise LTR models.
Learning-to-rank (LTR) is a class of supervised learning techniques that apply to ranking problems dealing with a large number of features. The popularity and widespread application of LTR models in prioritizing information in a variety of domains make their scrutability vital in today's landscape of fair and transparent learning systems. However, limited work exists that deals with interpreting the decisions of learning systems that output rankings. In this paper we propose a model-agnostic local explanation method that seeks to identify a small subset of input features as an explanation for a ranking decision. We introduce new notions of validity and completeness of explanations specifically for rankings, based on the presence or absence of selected features, as a way of measuring goodness. We devise a novel optimization problem to maximize validity directly and propose greedy algorithms as solutions. In extensive quantitative experiments we show that our approach outperforms other model-agnostic explanation approaches across pointwise, pairwise and listwise LTR models in validity while not compromising on completeness.
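To make the idea of greedily selecting features for validity more concrete, the following is a minimal Python sketch, not the paper's algorithm. The abstract does not give the formal definitions, so several things here are assumptions made for illustration: validity is taken to be Kendall's tau between the original ranking and the ranking produced when only the selected features are present, completeness is taken to be the negated tau when the selected features are absent, absent features are replaced by a baseline value of 0.0, and `linear_model`, `WEIGHTS` and `docs` are hypothetical toy stand-ins for a black-box ranker and its input.

```python
from itertools import combinations


def rank(docs, score_fn):
    """Return document indices ordered by descending model score."""
    scores = [score_fn(d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: -scores[i])


def kendall_tau(order_a, order_b):
    """Kendall's tau between two orderings of the same document indices."""
    pos_a = {doc: r for r, doc in enumerate(order_a)}
    pos_b = {doc: r for r, doc in enumerate(order_b)}
    concordant = discordant = 0
    for i, j in combinations(order_a, 2):
        if (pos_a[i] - pos_a[j]) * (pos_b[i] - pos_b[j]) > 0:
            concordant += 1
        else:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 1.0


def mask(docs, keep, baseline=0.0):
    """Keep the features in `keep`; set all others to `baseline` (assumed)."""
    return [[x if f in keep else baseline for f, x in enumerate(d)] for d in docs]


def validity(docs, score_fn, selected):
    """Assumed definition: agreement with the original ranking when only
    the selected features are present."""
    return kendall_tau(rank(docs, score_fn), rank(mask(docs, selected), score_fn))


def completeness(docs, score_fn, selected):
    """Assumed definition: disagreement with the original ranking when the
    selected features are absent."""
    rest = set(range(len(docs[0]))) - set(selected)
    return -kendall_tau(rank(docs, score_fn), rank(mask(docs, rest), score_fn))


def greedy_explanation(docs, score_fn, k):
    """Greedily add the feature that most increases validity, up to k features."""
    selected = []
    candidates = set(range(len(docs[0])))
    for _ in range(min(k, len(candidates))):
        best = max(sorted(candidates),
                   key=lambda f: validity(docs, score_fn, selected + [f]))
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy ranker: a linear pointwise scorer dominated by feature 0.
WEIGHTS = [3.0, 0.1, 0.0]


def linear_model(features):
    return sum(w * x for w, x in zip(WEIGHTS, features))


docs = [[0.9, 0.2, 0.5], [0.1, 0.9, 0.3], [0.5, 0.5, 0.9], [0.3, 0.1, 0.1]]
explanation = greedy_explanation(docs, linear_model, k=1)
print(explanation, validity(docs, linear_model, explanation))  # [0] 1.0
```

Because the sketch only calls `score_fn` on perturbed inputs, it treats the ranker as a black box, which is what model-agnostic means here. The exhaustive pairwise tau and the per-step scan over all candidate features are chosen for readability rather than efficiency.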