NEFLFeb 2, 2021

Stronger Separation of Analog Neuron Hierarchy by Deterministic Context-Free Languages

arXiv:2102.01633v13 citations
Originality Highly original
AI Analysis

This research provides a deeper understanding of the computational capabilities of recurrent neural networks for researchers in theoretical computer science and neural network theory, particularly concerning their ability to recognize formal languages.

This paper investigates the computational power of recurrent neural networks (NNs) with saturated-linear activation functions, specifically focusing on the separation of the analog neuron hierarchy. It demonstrates that 1ANNs (binary-state NNs with one analog neuron) cannot recognize any non-regular deterministic context-free language (DCFL), while 2ANNs (with two analog neurons) can recognize all DCFLs, thus strengthening the separation between these two classes.

We analyze the computational power of discrete-time recurrent neural networks (NNs) with the saturated-linear activation function within the Chomsky hierarchy. This model restricted to integer weights coincides with binary-state NNs with the Heaviside activation function, which are equivalent to finite automata (Chomsky level 3) recognizing regular languages (REG), while rational weights make this model Turing-complete even for three analog-state units (Chomsky level 0). For the intermediate model $α$ANN of a binary-state NN that is extended with $α\geq 0$ extra analog-state neurons with rational weights, we have established the analog neuron hierarchy 0ANNs $\subset$ 1ANNs $\subset$ 2ANNs $\subseteq$ 3ANNs. The separation 1ANNs $\subsetneqq$ 2ANNs has been witnessed by the non-regular deterministic context-free language (DCFL) $L_\#=\{0^n1^n\mid n\geq 1\}$ which cannot be recognized by any 1ANN even with real weights, while any DCFL (Chomsky level 2) is accepted by a 2ANN with rational weights. In this paper, we strengthen this separation by showing that any non-regular DCFL cannot be recognized by 1ANNs with real weights, which means (DCFLs $\setminus$ REG) $\subset$ (2ANNs $\setminus$ 1ANNs), implying 1ANNs $\cap$ DCFLs = 0ANNs. For this purpose, we have shown that $L_\#$ is the simplest non-regular DCFL by reducing $L_\#$ to any language in this class, which is by itself an interesting achievement in computability theory.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes