The Symmetries of Three-Layer ReLU Networks
For deep learning theorists, this work provides a complete algebraic description of the parameter symmetries of one specific architecture. The contribution is incremental: it extends prior symmetry analyses to a slightly deeper setting.
This paper characterizes all parameter symmetries of three-layer ReLU networks with a bottleneck, providing a polynomial-time algorithm to test functional equivalence. It shows that some symmetries yield local conservation laws under gradient flow while others do not.
We develop a framework for analyzing parameter symmetries in deep ReLU networks and obtain a complete characterization of the generic parameter fibers for three-layer bottleneck architectures. Our approach provides explicit semi-algebraic descriptions of these fibers and yields a polynomial-time algorithm for deciding whether two parameters are functionally equivalent. The symmetries include both discrete and continuous transformations arising from layer composition, and they depend on whether deeper layers hide or preserve geometric structure from preceding layers. Finally, we show that some of these symmetries induce local conservation laws along gradient flow, while others do not.
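To make "parameter symmetry" concrete, the sketch below checks two familiar ReLU symmetries on a small bias-free three-layer network: positive per-neuron rescaling (continuous) and hidden-unit permutation (discrete). It is a minimal illustration under stated assumptions, not the characterization or the polynomial-time equivalence test described above: the layer widths, the absence of biases, and all function names are hypothetical choices made for the example, and agreement on sampled inputs is only numerical evidence of functional equivalence.

```python
import random


def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def forward(params, x):
    """Bias-free three-layer ReLU network: W3 relu(W2 relu(W1 x))."""
    W1, W2, W3 = params
    return matvec(W3, relu(matvec(W2, relu(matvec(W1, x)))))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def rescale_first_hidden(params, scales):
    """Scale row i of W1 by scales[i] > 0 and column i of W2 by 1/scales[i]."""
    W1, W2, W3 = params
    W1s = [[s * w for w in row] for row, s in zip(W1, scales)]
    W2s = [[w / s for w, s in zip(row, scales)] for row in W2]
    return W1s, W2s, W3


def permute_first_hidden(params, perm):
    """Reorder the first hidden layer: rows of W1 and columns of W2 together."""
    W1, W2, W3 = params
    W1p = [W1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, W2p, W3


def max_output_gap(p, q, inputs):
    """Largest absolute output difference between two parameters on the inputs."""
    return max(
        abs(a - b)
        for x in inputs
        for a, b in zip(forward(p, x), forward(q, x))
    )


rng = random.Random(0)
d_in, h1, h2, d_out = 3, 4, 2, 1  # illustrative widths (assumption)
params = (
    rand_matrix(h1, d_in, rng),
    rand_matrix(h2, h1, rng),
    rand_matrix(d_out, h2, rng),
)
inputs = [[rng.uniform(-2.0, 2.0) for _ in range(d_in)] for _ in range(50)]

scaled = rescale_first_hidden(params, [0.5, 2.0, 3.0, 0.1])
permuted = permute_first_hidden(params, [2, 0, 3, 1])

gap_scaled = max_output_gap(params, scaled, inputs)
gap_permuted = max_output_gap(params, permuted, inputs)
print(gap_scaled, gap_permuted)  # both gaps are at floating-point noise level
```

Both transformations leave the network function unchanged because relu(s z) = s relu(z) for any s > 0 and because reordering hidden units only reorders the terms of a sum. The rescaling forms a continuous family, while the permutations form a finite group, which matches the discrete and continuous distinction drawn in the abstract.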