CL · Aug 30, 2021

Neuron-level Interpretation of Deep NLP Models: A Survey

Hassan Sajjad, Nadir Durrani, Fahim Dalvi

arXiv:2108.13138v224.5319 citations

Originality: Synthesis-oriented

AI Analysis

The survey addresses the need for granular interpretability among NLP researchers and practitioners, but its contribution is incremental: it synthesizes existing work rather than introducing new methods.

This paper surveys recent research on neuron-level interpretation methods for deep NLP models, covering techniques for discovering and understanding neurons, evaluation approaches, key findings, and applications like model control and domain adaptation.

The proliferation of deep neural networks across domains has increased the need to interpret these models. Early work along this line, and the papers that surveyed it, focused on high-level representation analysis. A more recent branch of work, however, studies interpretability at a more granular level by analyzing neurons within these models. In this paper, we survey the work done on neuron analysis, including: i) methods to discover and understand neurons in a network, ii) evaluation methods, iii) major findings, including cross-architectural comparisons, that neuron analysis has uncovered, iv) applications of neuron probing, such as controlling the model and domain adaptation, and v) a discussion of open issues and future research directions.
