Property Neurons in Self-Supervised Speech Transformers
This work addresses the need for precise model analysis and efficient pruning in speech processing by offering a neuron-level method for model editing and compression, though it builds incrementally on existing layer-wise analysis approaches.
The authors identify the specific neurons responsible for speech properties such as phones, gender, and pitch in self-supervised speech Transformers. They show that removing these neurons degrades downstream performance and that protecting them during pruning is more effective than norm-based pruning.
Many studies have analyzed self-supervised speech Transformers, particularly through layer-wise analysis. It is, however, desirable to have an approach that can pinpoint the exact subset of neurons responsible for a particular property of speech, because such a subset lends itself to model pruning and model editing. In this work, we identify a set of property neurons in the feedforward layers of Transformers to study how speech-related properties, such as phones, gender, and pitch, are stored. When the neurons of a particular property are removed (a simple form of model editing), the respective downstream performance degrades significantly, showing the importance of the property neurons. We apply this approach to pruning the feedforward layers in Transformers, where most of the model parameters reside. We show that protecting property neurons during pruning is significantly more effective than norm-based pruning. The code for identifying property neurons is available at https://github.com/nervjack2/PropertyNeurons.
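To make the two operations described above concrete, the following is a minimal NumPy sketch of (a) removing a chosen set of feedforward neurons and (b) norm-based pruning with and without a protected set. It is an illustration under stated assumptions, not the implementation in the repository: the function names are hypothetical, ReLU stands in for whatever activation the model uses, the pruning score (sum of the L2 norms of each neuron's incoming and outgoing weights) is one common choice picked here for simplicity, and the indices of the property neurons are assumed to be given.

```python
import numpy as np


def ffn_forward(x, w_in, b_in, w_out, b_out, ablate=()):
    """Feedforward layer with optional neuron removal.

    x: (n, d_model), w_in: (d_model, d_ff), w_out: (d_ff, d_model).
    ReLU is an assumption made for this sketch.
    """
    h = np.maximum(x @ w_in + b_in, 0.0)
    idx = np.asarray(ablate, dtype=int)
    h[:, idx] = 0.0  # "remove" the selected hidden neurons
    return h @ w_out + b_out


def neuron_norms(w_in, w_out):
    """Assumed pruning score: L2 norm of incoming plus outgoing weights."""
    return np.linalg.norm(w_in, axis=0) + np.linalg.norm(w_out, axis=1)


def select_kept_neurons(w_in, w_out, n_keep, protected=()):
    """Return sorted indices of the hidden neurons to keep.

    Protected neurons are always kept; the remaining slots go to the
    highest-scoring unprotected neurons. With an empty `protected`,
    this reduces to plain norm-based pruning.
    """
    protected = np.unique(np.asarray(protected, dtype=int))
    if len(protected) > n_keep:
        raise ValueError("more protected neurons than slots to keep")
    scores = neuron_norms(w_in, w_out)
    scores[protected] = np.inf  # protected neurons outrank all others
    order = np.argsort(-scores, kind="stable")
    return np.sort(order[:n_keep])


def prune_ffn(w_in, b_in, w_out, keep):
    """Slice the feedforward weights down to the kept hidden neurons."""
    return w_in[:, keep], b_in[keep], w_out[keep, :]


# Toy layer: 2 model dimensions, 4 hidden neurons with increasing norms.
w_in = np.array([[1.0, 0.0, 3.0, 0.0],
                 [0.0, 2.0, 0.0, 4.0]])
b_in = np.zeros(4)
w_out = np.ones((4, 2))
b_out = np.zeros(2)

keep_norm = select_kept_neurons(w_in, w_out, n_keep=2)
keep_prot = select_kept_neurons(w_in, w_out, n_keep=2, protected=[0])
```

In this toy layer, plain norm-based selection keeps the two largest-norm neurons, whereas treating neuron 0 as a property neuron forces it into the kept set even though its norm is the smallest. Passing either index set to `prune_ffn` yields the compressed weights.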