DC is all you need: describing ReLU from a signal processing standpoint
This paper provides a theoretical analysis of ReLU's spectral behavior for neural network researchers, but it is incremental, as it builds on existing studies of activation functions.
The paper tackles the problem of understanding ReLU activation functions in the frequency domain. It demonstrates that ReLU introduces higher-frequency oscillations and a DC component, and its experiments indicate that the DC component helps the model converge to weight configurations near the initial random weights.
Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher-frequency oscillations in the signal as well as a constant DC component. Furthermore, we investigate the importance of this DC component and demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC component helps the model converge to a weight configuration that is close to the initial random weights.
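The claim that ReLU adds a DC component and higher-frequency content can be checked numerically on a single sinusoid. The following is a minimal sketch, not the paper's own experiment: it assumes NumPy, a unit-amplitude sine with a whole number of cycles in the window, and hypothetical names (`relu`, `amplitude_spectrum`, `f0`, `n_samples`) chosen for illustration. It relies on the standard Fourier series of a half-wave rectified sine, which has a DC term of 1/pi, a fundamental of amplitude 1/2, and even harmonics only.

```python
import numpy as np


def relu(x):
    """Element-wise ReLU: max(x, 0)."""
    return np.maximum(x, 0.0)


def amplitude_spectrum(signal):
    """One-sided amplitude spectrum of a real signal with an even number of samples.

    Bin k holds the amplitude of the component with k cycles per window.
    """
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) / n
    # Fold negative frequencies into positive ones. The DC bin (first) and
    # the Nyquist bin (last, for even n) have no mirror image, so skip them.
    spectrum[1:-1] *= 2.0
    return spectrum


# Assumed setup: 1024 samples, 8 full cycles of a unit-amplitude sine.
n_samples = 1024
f0 = 8
t = np.arange(n_samples) / n_samples
x = np.sin(2 * np.pi * f0 * t)

spec_in = amplitude_spectrum(x)
spec_out = amplitude_spectrum(relu(x))

# The input has energy only at f0. After ReLU, a DC term (bin 0) and a
# component at 2 * f0 appear, while the fundamental is halved.
for label, k in [("DC", 0), ("f0", f0), ("2*f0", 2 * f0), ("3*f0", 3 * f0)]:
    print(f"{label:>5}: input {spec_in[k]:.4f}  after ReLU {spec_out[k]:.4f}")
```

In this setting the DC bin after ReLU should sit close to 1/pi and the bin at `2 * f0` close to 2/(3 pi), while the bin at `3 * f0` stays near zero. Subtracting the mean of `relu(x)` before taking the spectrum removes only the DC bin, which is one simple way to isolate its contribution.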