LG SP Jul 23, 2024

DC is all you need: describing ReLU from a signal processing standpoint

arXiv:2407.16556v2, 5 citations, h-index: 5
Originality: Synthesis-oriented
AI Analysis

This paper provides a theoretical analysis of ReLU's spectral behavior for neural network researchers. The contribution is incremental, since it builds on existing studies of activation functions.

The paper addresses the problem of describing the ReLU activation function in the frequency domain. It demonstrates that ReLU introduces higher-frequency oscillations and a DC component into the signal, and its experiments indicate that the DC component helps the model converge to a weight configuration close to the initial random weights.

Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and a constant DC component. Furthermore, we investigate the importance of this DC component, where we demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC helps to converge to a weight configuration that is close to the initial random weights.
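The claim that ReLU adds a DC component and higher-frequency oscillations can be checked numerically on a single sinusoid. The sketch below is an illustration only, not the authors' Taylor-expansion derivation or their code: it applies ReLU to a sampled sine and inspects the FFT. The function name `relu_spectrum` and the parameters `freq_bin` and `n_samples` are hypothetical, chosen for this example. For a unit-amplitude sine, the standard Fourier series of a half-wave rectified sine predicts a DC term of 1/π, a fundamental of amplitude 1/2, and even harmonics of amplitude 2/(π(4k² − 1)), with no odd harmonics above the fundamental.

```python
import numpy as np


def relu_spectrum(freq_bin=5, n_samples=1024):
    """One-sided amplitude spectrum of ReLU(sin) for a tone centred on `freq_bin`.

    Hypothetical helper for illustration; not taken from the paper's repository.
    """
    n = np.arange(n_samples)
    # Zero-mean input: a pure sine with an integer number of cycles (no leakage).
    x = np.sin(2 * np.pi * freq_bin * n / n_samples)
    # ReLU keeps the positive half-waves and zeroes the negative ones.
    y = np.maximum(x, 0.0)
    # Normalised real FFT; double the interior bins to get one-sided amplitudes.
    amp = np.abs(np.fft.rfft(y)) / n_samples
    amp[1:-1] *= 2
    return amp


if __name__ == "__main__":
    f0 = 5
    amp = relu_spectrum(freq_bin=f0)
    print(f"DC component:     {amp[0]:.4f} (series value 1/pi = {1 / np.pi:.4f})")
    print(f"fundamental:      {amp[f0]:.4f}")
    print(f"second harmonic:  {amp[2 * f0]:.4f}")
    print(f"third harmonic:   {amp[3 * f0]:.4f}")
```

The input sine has no energy at bin 0 or at bin `2 * freq_bin`, so any amplitude found there after the ReLU is introduced by the non-linearity itself.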

Code Implementations: 1 repo
