Sanity Simulations for Saliency Methods
This work addresses the difficulty of developing and adopting saliency methods for model interpretability in machine learning. The contribution is incremental in that it provides an evaluation tool rather than a new saliency method.
The authors tackle the problem of evaluating saliency methods by building a synthetic benchmarking framework, SMERF, which enables ground-truth-based evaluation and reveals significant limitations in existing methods.
Saliency methods are a popular class of feature attribution explanation methods that aim to capture a model's predictive reasoning by identifying "important" pixels in an input image. However, the development and adoption of these methods are hindered by the lack of access to ground-truth model reasoning, which prevents accurate evaluation. In this work, we design a synthetic benchmarking framework, SMERF, that allows us to perform ground-truth-based evaluation while controlling the complexity of the model's reasoning. Experimentally, SMERF reveals significant limitations in existing saliency methods and, as a result, represents a useful tool for the development of new saliency methods.
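To make "ground-truth-based evaluation" concrete, here is a minimal sketch of how a saliency map could be scored once the pixels the model should rely on are known. It assumes the ground truth is a binary mask and that the score is intersection-over-union (IoU) between that mask and the top-k most salient pixels. The abstract does not state which metric SMERF uses, so the metric, the function names `top_k_mask` and `iou`, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
def top_k_mask(saliency, k):
    """Return a binary mask marking the k highest-valued pixels (hypothetical helper)."""
    # Flatten to (value, row, col) triples so pixels can be ranked by saliency.
    ranked = [(v, r, c) for r, row in enumerate(saliency) for c, v in enumerate(row)]
    ranked.sort(key=lambda t: t[0], reverse=True)
    chosen = {(r, c) for _, r, c in ranked[:k]}
    return [[1 if (r, c) in chosen else 0 for c in range(len(row))]
            for r, row in enumerate(saliency)]


def iou(pred, truth):
    """Intersection-over-union of two binary masks with the same shape."""
    pairs = [(p, t) for p_row, t_row in zip(pred, truth) for p, t in zip(p_row, t_row)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Two empty masks agree perfectly by convention (an assumption of this sketch).
    return intersection / union if union else 1.0


# Toy 3x3 example: the "model" is assumed to rely on the two left-column pixels.
saliency = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.0, 0.7],
]
ground_truth = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]

k = sum(sum(row) for row in ground_truth)  # compare against as many pixels as the mask has
score = iou(top_k_mask(saliency, k), ground_truth)
```

In this toy case the two most salient pixels coincide with the mask, so the score is perfect. Raising `k` to 3 pulls in the spurious bottom-right pixel and lowers the score, which is the kind of failure a ground-truth benchmark is meant to expose.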