IXAM: Interactive Explainability for Authorship Attribution Models
This work addresses the need of researchers and practitioners for more interpretable authorship attribution, though its contribution is incremental because it builds on existing embedding-based models.
The authors tackle the problem of explaining authorship attribution models with an interactive framework that lets users explore a model's embedding space and construct explanations from writing style features at multiple levels of granularity. They demonstrate the framework's value through a user evaluation that compares it with predefined explanations.
We present IXAM, an Interactive eXplainability framework for Authorship Attribution Models. Given an authorship attribution (AA) task and an embedding-based AA model, our tool enables users to interactively explore the model's embedding space and construct an explanation of the model's prediction as a set of writing style features at different levels of granularity. Through a user evaluation, we demonstrate the value of our framework compared to predefined stylistic explanations.
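For readers unfamiliar with the setting, the following minimal sketch illustrates what an embedding-based AA prediction can look like: a query document is attributed to the candidate author whose embedding is closest under cosine similarity. This is an illustrative assumption about a common setup, not IXAM's implementation; the function names (`cosine`, `attribute`) and the toy three-dimensional vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def attribute(query, candidates):
    """Return the closest candidate author and all similarity scores.

    `candidates` maps an author name to that author's embedding.
    """
    scores = {name: cosine(query, emb) for name, emb in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy embeddings; a real AA model would produce
# high-dimensional vectors from the authors' texts.
candidates = {
    "author_a": [0.9, 0.1, 0.0],
    "author_b": [0.1, 0.8, 0.3],
}
query = [0.8, 0.2, 0.1]

best, scores = attribute(query, candidates)
print(best, scores)
```

The similarity scores alone do not say why two documents are close in the embedding space, which is the gap that style-feature explanations of the kind described above are meant to fill.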