On The Expressivity of Recurrent Neural Cascades
This work addresses a theoretical gap in understanding the trade-offs of acyclic recurrent networks, which is of interest to researchers in computational linguistics and neural network theory, though it is incremental in that it builds on prior expressivity studies.
The paper investigates whether the architectural advantages of Recurrent Neural Cascades (RNCs) come at the cost of reduced expressivity. It shows that RNCs with sign or tanh activations and positive recurrent weights capture the star-free regular languages, and that RNCs can reach the expressivity of all regular languages by adding neurons that implement groups.
Recurrent Neural Cascades (RNCs) are the recurrent neural networks with no cyclic dependencies among recurrent neurons. This class of recurrent networks has received a lot of attention in practice. Besides training methods for a fixed architecture such as backpropagation, the cascade architecture naturally allows for constructive learning methods, where recurrent nodes are added incrementally one at a time, often yielding smaller networks. Furthermore, acyclicity amounts to a structural prior that, even for the same number of neurons, yields a more favourable sample complexity compared to a fully-connected architecture. A central question is whether the advantages of the cascade architecture come at the cost of reduced expressivity. We provide new insights into this question. We show that the regular languages captured by RNCs with sign and tanh activations with positive recurrent weights are the star-free regular languages. In order to establish our results we developed a novel framework where capabilities of RNCs are assessed by analysing which semigroups and groups a single neuron is able to implement. A notable implication of our framework is that RNCs can achieve the expressivity of all regular languages by introducing neurons that can implement groups.
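To make the cascade structure concrete, the following is a minimal sketch of a two-neuron RNC with sign activation and positive recurrent weights. It is an illustration under stated assumptions, not the paper's construction: the function name `rnc_accepts`, the indicator encoding of the input symbols, the initial states, and the hand-picked weights are all hypothetical. The network recognises the star-free language of words that contain an `a` followed, at some later position, by a `b`. Neuron 1 depends only on the input and its own previous state; neuron 2 additionally reads the previous state of neuron 1, and nothing feeds back from neuron 2 to neuron 1, so there is no cyclic dependency.

```python
def sign(z):
    # Sign activation; the value at 0 is a convention and is never hit here.
    return 1 if z >= 0 else -1


def rnc_accepts(word):
    """Run a hand-built two-neuron cascade over `word` (hypothetical example).

    Accepts exactly the words containing an 'a' followed later by a 'b'.
    States take values in {-1, +1}; both neurons start at -1.
    """
    x1, x2 = -1, -1
    for ch in word:
        # Assumed input encoding: one indicator per relevant symbol.
        a = 1.0 if ch == "a" else 0.0
        b = 1.0 if ch == "b" else 0.0

        # Neuron 1: positive recurrent weight 1.0.
        # An 'a' forces the state to +1; any other symbol keeps the state.
        new_x1 = sign(1.0 * x1 + 2.0 * a)

        # Neuron 2: positive recurrent weight 1.0, reads neuron 1's
        # previous state. It switches to +1 only when the current symbol
        # is 'b' and neuron 1 was already +1; otherwise it keeps its state.
        new_x2 = sign(1.0 * x2 + 1.0 * b + 0.5 * x1 - 0.4)

        x1, x2 = new_x1, new_x2
    return x2 == 1


if __name__ == "__main__":
    for w in ["ab", "ba", "aab", "cacb", "bba", ""]:
        print(repr(w), rnc_accepts(w))
```

Each neuron here behaves like a set-and-hold latch: the positive recurrent weight lets it keep its state when the input drive is weak, and a strong enough drive overrides it. A neuron of this kind cannot, for instance, flip its state on every occurrence of a symbol, which is the sort of group-like behaviour the abstract says must be added to go beyond star-free languages.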