TopEx: Topic-based Explanations for Model Comparison
TopEx addresses a problem faced by NLP researchers and practitioners: model explanations that are overwhelming or incomparable across models. The contribution appears incremental, however, because it builds on existing explanation methods.
The paper tackles the challenge of comparing language models by introducing TopEx, a topic-based explanation method that provides model-agnostic topics to identify similarities and differences between models like DistilRoBERTa and GPT-2 on NLP tasks.
Meaningfully comparing language models is challenging with current explanation methods. Current explanations are either overwhelming for humans because of large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.