Stability of Internal States in Recurrent Neural Networks Trained on Regular Languages
This work offers insight into the interpretability and robustness of RNNs for language processing tasks, though the contribution is incremental: it builds on existing analogies between RNNs and finite automata.
The study investigates how recurrent neural networks trained on regular languages maintain stable internal states when subjected to noise. It finds that neurons saturate and form discrete clusters resembling the states of a finite state machine, with deterministic transitions between clusters and recovery from perturbations, even for arbitrarily long strings.
We provide an empirical study of the stability of recurrent neural networks trained to recognize regular languages. When a small amount of noise is introduced into the activation function, the neurons in the recurrent layer tend to saturate in order to compensate for the variability. In this saturated regime, analysis of the network activations shows a set of clusters that resemble discrete states in a finite state machine. We show that transitions between these states in response to input symbols are deterministic and stable. The networks display stable behavior for arbitrarily long strings, and when random perturbations are applied to any of the states, they are able to recover and their evolution converges to the original clusters. This observation reinforces the interpretation of the networks as finite automata, with neurons or groups of neurons coding specific and meaningful input patterns.
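The saturation and recovery behavior described above can be illustrated with a toy example. The following is a minimal sketch, not the networks or training procedure used in the study: it assumes a single tanh unit with hand-set, hypothetical weights `W`, `U`, `B` that acts as a latch for the regular language "strings over {0,1} containing at least one 1". The helper names `step`, `run`, and `recover`, the noise level `sigma`, and the additive Gaussian noise model are likewise assumptions made for illustration.

```python
import numpy as np

# Hypothetical hand-set weights (not taken from the study). With a large
# recurrent gain the unit saturates near -1 ("no 1 seen yet") or +1
# ("a 1 has been seen"), mimicking a two-state automaton.
W, U, B = 3.0, 8.0, 0.0


def step(h, x, noise=0.0):
    """One recurrent update with additive noise inside the activation."""
    return np.tanh(W * h + U * x + B + noise)


def run(symbols, h0=-1.0, sigma=0.0, seed=0):
    """Return the hidden-state trajectory for a sequence of 0/1 symbols."""
    rng = np.random.default_rng(seed)
    h, traj = h0, []
    for x in symbols:
        h = step(h, x, sigma * rng.normal())
        traj.append(h)
    return np.array(traj)


def recover(h, kick, n_steps=20):
    """Perturb state h by `kick`, then feed zeros and return the final state."""
    h = h + kick
    for _ in range(n_steps):
        h = step(h, 0)
    return h


if __name__ == "__main__":
    # Long input: 100 zeros, a single 1, then 100 more zeros.
    symbols = [0] * 100 + [1] + [0] * 100
    traj = run(symbols, sigma=0.2)
    print("state range before the 1:", traj[:100].min(), traj[:100].max())
    print("state range after the 1: ", traj[100:].min(), traj[100:].max())
    print("unperturbed vs. perturbed:", recover(-1.0, 0.0), recover(-1.0, 0.4))
```

In this sketch the noisy trajectory stays in one of two well-separated clusters, and a perturbed state contracts back to the same fixed point as the unperturbed one, because the slope of tanh is small in the saturated regime. A trained multi-unit network would need clustering of its activation vectors to identify the corresponding states.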