LG, Jun 18, 2024

When Are Bias-Free ReLU Networks Effectively Linear Networks?

Yedi Zhang, Andrew Saxe, Peter E. Latham

arXiv:2406.12615v3, 7.94 citations

Originality: Incremental advance

AI Analysis

This work contributes to neural network theory by identifying when bias-free ReLU networks reduce to linear models. It is an incremental advance in understanding network behavior in these specific settings.

The paper investigates the expressivity and learning dynamics of bias-free ReLU networks. It shows that two-layer versions have limited expressivity and, under symmetry conditions on the data, follow the same learning dynamics as linear networks, which permits analytical solutions outside the lazy learning regime.

We investigate the implications of removing bias in ReLU networks regarding their expressivity and learning dynamics. We first show that two-layer bias-free ReLU networks have limited expressivity: the only odd function two-layer bias-free ReLU networks can express is a linear one. We then show that, under symmetry conditions on the data, these networks have the same learning dynamics as linear networks. This enables us to give analytical time-course solutions to certain two-layer bias-free (leaky) ReLU networks outside the lazy learning regime. While deep bias-free ReLU networks are more expressive than their two-layer counterparts, they still share a number of similarities with deep linear networks. These similarities enable us to leverage insights from linear networks to understand certain ReLU networks. Overall, our results show that some properties previously established for bias-free ReLU networks arise due to equivalence to linear networks.
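The expressivity claim can be checked numerically with a minimal sketch. It is not the authors' code; the function names (`two_layer_relu`, `odd_part`, `linear_map`) and the random weights are illustrative assumptions. It relies on the identity relu(z) - relu(-z) = z, which implies that the odd part of any two-layer bias-free ReLU network f(x) = sum_i a_i relu(w_i . x) equals the linear map (1/2) a^T W x. A network that computes an odd function equals its own odd part, so it must be linear.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def two_layer_relu(x, W, a):
    """Bias-free two-layer ReLU network: f(x) = sum_i a[i] * relu(W[i] . x)."""
    return sum(a_i * relu(sum(w * xj for w, xj in zip(w_i, x)))
               for a_i, w_i in zip(a, W))

def odd_part(x, W, a):
    """Odd part of the network: (f(x) - f(-x)) / 2."""
    neg_x = [-xj for xj in x]
    return 0.5 * (two_layer_relu(x, W, a) - two_layer_relu(neg_x, W, a))

def linear_map(x, W, a):
    """The linear function 0.5 * a^T W x."""
    return 0.5 * sum(a_i * sum(w * xj for w, xj in zip(w_i, x))
                     for a_i, w_i in zip(a, W))

# Hypothetical sizes and random weights, chosen only for illustration.
rng = random.Random(0)
d, h = 3, 8  # input dimension, hidden width
W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(h)]
a = [rng.gauss(0.0, 1.0) for _ in range(h)]

# Compare the odd part with the linear map on random inputs.
max_gap = 0.0
for _ in range(100):
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    max_gap = max(max_gap, abs(odd_part(x, W, a) - linear_map(x, W, a)))

print("largest |odd_part - linear_map| over 100 inputs:", max_gap)
```

The printed gap should be at the level of floating-point rounding for any choice of weights, since the identity holds unit by unit.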
