CL LG

May 31, 2023

Efficient Shapley Values Estimation by Amortization for Text Classification

Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang

arXiv:2305.19998v1

26.3222 citations

Has Code

Originality: Highly original

AI Analysis

This work addresses the trade-off between stability and efficiency when computing Shapley Value explanations for text classification, offering a practical solution for researchers and practitioners who use large pretrained models.

The paper tackles the computational cost and instability of estimating Shapley Values for neural text classification models. It develops an amortized model that predicts these values directly, achieving a speedup of up to 60 times while maintaining accuracy and stability.

Despite the popularity of Shapley Values in explaining neural text classification models, computing them is prohibitive for large pretrained models because of the large number of model evaluations required. In practice, Shapley Values are often estimated with a small number of stochastic model evaluations. However, we show that the estimated Shapley Values are sensitive to random seed choices: the top-ranked features often have little overlap across different seeds, especially on examples with longer input texts. This can only be mitigated by aggregating thousands of model evaluations, which in turn incurs substantial computational overhead. To mitigate the trade-off between stability and efficiency, we develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations. It is trained on a set of examples whose Shapley Values are estimated from a large number of model evaluations to ensure stability. Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately, with a speedup of up to 60 times compared to traditional methods. Furthermore, the estimated values are stable because inference is deterministic. We release our code at https://github.com/yangalan123/Amortized-Interpretability.
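The seed sensitivity described in the abstract can be illustrated with a minimal sketch of permutation sampling, one common stochastic Shapley estimator. This is not the authors' code, and the abstract does not say which estimator the paper uses. The names `toy_model`, `sampled_shapley`, and `top_k`, along with the token weights, are hypothetical and chosen only for illustration; a real setup would replace `toy_model` with a call to a pretrained classifier on a masked input.

```python
import random


def toy_model(tokens, present):
    """Hypothetical stand-in for a classifier score on a masked input.

    `present` is the set of token positions left unmasked.
    """
    weights = {"great": 2.0, "not": -1.5, "boring": -2.0}  # assumed toy weights
    score = sum(weights.get(tokens[i], 0.1) for i in present)
    words = {tokens[i] for i in present}
    # Interaction term, so that marginal contributions depend on context.
    if "not" in words and "great" in words:
        score -= 3.0
    return score


def sampled_shapley(tokens, model, num_permutations, seed):
    """Estimate Shapley Values by averaging marginal contributions
    over randomly sampled feature orderings."""
    rng = random.Random(seed)
    n = len(tokens)
    values = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        present = set()
        prev = model(tokens, present)
        for i in order:
            present.add(i)
            cur = model(tokens, present)
            values[i] += cur - prev  # marginal contribution of token i
            prev = cur
    return [v / num_permutations for v in values]


def top_k(values, k):
    """Positions of the k features with the largest absolute attribution."""
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return set(ranked[:k])


if __name__ == "__main__":
    sentence = ["the", "movie", "is", "not", "great"]
    for budget in (1, 2000):
        a = sampled_shapley(sentence, toy_model, budget, seed=0)
        b = sampled_shapley(sentence, toy_model, budget, seed=1)
        shared = len(top_k(a, 2) & top_k(b, 2))
        print(f"{budget} permutations: top-2 overlap across seeds = {shared}/2")
```

Each permutation costs one model call per token, so a budget in the thousands is expensive for a large pretrained model. An amortized model in the sense of the abstract would instead be trained to map an input directly to attribution targets computed once with a large budget, so that explaining a new input needs no further sampling.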
