Exponential Separations in Symmetric Neural Networks
This provides theoretical insight into architectural limitations for researchers in neural network theory and symmetric function learning.
The paper demonstrates an exponential separation between two symmetric neural network architectures (Relational Networks and DeepSets) by constructing a symmetric function that can be efficiently approximated by Relational Networks but requires exponential width in DeepSets.
In this work we demonstrate a novel separation between symmetric neural network architectures. Specifically, we consider the Relational Network~\parencite{santoro2017simple} architecture as a natural generalization of the DeepSets~\parencite{zaheer2017deep} architecture, and study their representational gap. Under the restriction to analytic activation functions, we construct a symmetric function acting on sets of size $N$ with elements in dimension $D$, which can be efficiently approximated by the former architecture, but provably requires width exponential in $N$ and $D$ for the latter.