Break It Down: Evidence for Structural Compositionality in Neural Networks
The paper addresses the fundamental question of how neural networks compute their solutions, with implications for interpretability and AI design, though it is incremental in that it studies existing architectures.
The paper investigates whether neural networks exhibit structural compositionality, that is, whether they break tasks into subroutines implemented by modular subnetworks. Using model pruning across vision and language tasks, it shows that models often implement subroutines in modular subnetworks, each of which can be ablated without impairing the others.
Though modern neural networks have achieved impressive performance in both vision and language tasks, we know little about the functions that they implement. One possibility is that neural networks implicitly break down complex tasks into subroutines, implement modular solutions to these subroutines, and compose them into an overall solution to a task - a property we term structural compositionality. Another possibility is that they may simply learn to match new inputs to learned templates, eliding task decomposition entirely. Here, we leverage model pruning techniques to investigate this question in both vision and language across a variety of architectures, tasks, and pretraining regimens. Our results demonstrate that models often implement solutions to subroutines via modular subnetworks, which can be ablated while maintaining the functionality of other subnetworks. This suggests that neural networks may be able to learn compositionality, obviating the need for specialized symbolic mechanisms.
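The ablation logic described above can be illustrated with a minimal sketch. It is not the paper's procedure: the paper finds subnetworks with model pruning on trained models, whereas this toy example assumes a hand-built two-layer network and a hand-specified binary mask. All names (`W1`, `W2`, `forward`, `accuracies`, `ablate_a`) are hypothetical. Subroutine A reports the sign of the first input and subroutine B the sign of the second; masking out the weights that serve A should reduce A to chance while leaving B intact.

```python
import itertools
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Hand-built toy network (an assumption for illustration, not a trained model).
# Hidden units 0-1 serve subroutine A (sign of x[0]);
# hidden units 2-3 serve subroutine B (sign of x[1]).
W1 = np.array([
    [ 1.0,  0.0],
    [-1.0,  0.0],
    [ 0.0,  1.0],
    [ 0.0, -1.0],
])
# Output row 0 is the logit for subroutine A, row 1 for subroutine B.
W2 = np.array([
    [1.0, -1.0, 0.0,  0.0],
    [0.0,  0.0, 1.0, -1.0],
])

def forward(x, mask):
    """Run the network with a binary mask over W1; zero entries are ablated."""
    hidden = relu(x @ (W1 * mask).T)
    return hidden @ W2.T  # shape (n_samples, 2)

def accuracies(x, mask):
    """Per-subroutine accuracy: [accuracy on A, accuracy on B]."""
    predictions = forward(x, mask) > 0
    targets = x > 0
    return (predictions == targets).mean(axis=0)

# Balanced evaluation grid: each input is positive in exactly half the samples.
values = [-2.0, -1.0, 1.0, 2.0]
X = np.array(list(itertools.product(values, values)))  # shape (16, 2)

full_mask = np.ones_like(W1)
ablate_a = full_mask.copy()
ablate_a[0:2, :] = 0.0  # remove the weights feeding subroutine A's hidden units

acc_full = accuracies(X, full_mask)
acc_ablated = accuracies(X, ablate_a)
print("full network     [A, B]:", acc_full)
print("subnetwork A cut [A, B]:", acc_ablated)
```

In this toy setting the separation is perfect by construction. In a trained model, the mask would have to be discovered rather than specified, and the interesting empirical question is how far accuracy on the other subroutine is preserved.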