NeuroSurgeon: A Toolkit for Subnetwork Analysis
NeuroSurgeon gives explainable-AI researchers a tool for analyzing functional circuits in neural networks, but the contribution is incremental because it builds on existing decomposition methods.
The authors address the problem of understanding neural networks by developing NeuroSurgeon, a toolkit for discovering and manipulating subnetworks in Huggingface Transformers models. The result is a freely available Python library.
Despite recent advances in the field of explainability, much remains unknown about the algorithms that neural networks learn to represent. Recent work has attempted to understand trained models by decomposing them into functional circuits (Csordás et al., 2020; Lepori et al., 2023). To advance this research, we developed NeuroSurgeon, a Python library that can be used to discover and manipulate subnetworks within models in the Huggingface Transformers library (Wolf et al., 2019). NeuroSurgeon is freely available at https://github.com/mlepori1/NeuroSurgeon.