Discovering the Compositional Structure of Vector Representations with Role Learning Networks
This work addresses the interpretability gap in neural networks for compositional reasoning by providing a method to uncover implicit symbolic structure, though its contribution is incremental in that it analyzes existing models rather than proposing new ones.
The paper tackles the question of how neural networks succeed at compositional tasks without explicit symbolic representations. It shows that recurrent networks converge to solutions that implicitly encode symbolic structure, and verifies this by manipulating hidden embeddings and confirming the predicted causal effect on model outputs.
How can neural networks perform so well on compositional tasks even though they lack explicit compositional representations? We use a novel analysis technique called ROLE to show that recurrent neural networks perform well on such tasks by converging to solutions which implicitly represent symbolic structure. This method uncovers a symbolic structure which, when properly embedded in vector space, closely approximates the encodings of a standard seq2seq network trained to perform the compositional SCAN task. We verify the causal importance of the discovered symbolic structure by showing that, when we systematically manipulate hidden embeddings based on this symbolic structure, the model's output is changed in the way predicted by our analysis.
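To make the idea of a symbolic structure "embedded in vector space" and then manipulated more concrete, here is a minimal sketch. It assumes the embedding is a sum of filler-role outer products (a tensor product representation); the abstract above does not specify the embedding, so this choice is an assumption. The names `tpr_encode` and `swap_filler`, the role labels `r0` and `r1`, the dimensions, and the random vectors are all hypothetical illustrations, not the learned quantities or the procedure from the paper.

```python
import numpy as np

def tpr_encode(fillers, roles, filler_emb, role_emb):
    """Embed a symbolic structure as a sum of filler-role outer products.

    Assumption: a tensor-product-style embedding; the abstract does not
    state which embedding the analysis uses.
    """
    d_f = next(iter(filler_emb.values())).shape[0]
    d_r = next(iter(role_emb.values())).shape[0]
    total = np.zeros((d_f, d_r))
    for f, r in zip(fillers, roles):
        total += np.outer(filler_emb[f], role_emb[r])
    return total

def swap_filler(encoding, role, old, new, filler_emb, role_emb):
    """Replace the filler bound to `role` by subtracting the old binding
    and adding the new one (a hypothetical manipulation of the encoding)."""
    old_binding = np.outer(filler_emb[old], role_emb[role])
    new_binding = np.outer(filler_emb[new], role_emb[role])
    return encoding - old_binding + new_binding

# Hypothetical random embeddings; in practice these would be learned.
rng = np.random.default_rng(0)
filler_emb = {w: rng.normal(size=8) for w in ["jump", "walk", "twice"]}
role_emb = {r: rng.normal(size=4) for r in ["r0", "r1"]}

original = tpr_encode(["jump", "twice"], ["r0", "r1"], filler_emb, role_emb)
edited = swap_filler(original, "r0", "jump", "walk", filler_emb, role_emb)
target = tpr_encode(["walk", "twice"], ["r0", "r1"], filler_emb, role_emb)

# Because the embedding is linear in its bindings, the edited encoding
# should match the encoding built directly from the new structure.
print(np.allclose(edited, target))
```

In this toy setting the edit is exact because the encoding is built from the bindings by construction. For a trained network the symbolic embedding only approximates the hidden state, so the analogous test is whether the model's output changes as the edited structure predicts.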