Ghosts in Neural Networks: Existence, Structure and Role of Infinite-Dimensional Null Space
This work addresses a foundational aspect of neural network theory for researchers in deep learning, but it is incremental as it builds on existing overparametrization studies without introducing a new paradigm.
The paper tackles the problem of understanding the null components (ghosts) in overparametrized neural networks, showing that any null element can be uniquely expressed as a linear combination of ridgelet transforms, which helps analyze the parameter landscape. As an application, it discusses the impact of these ghosts on generalization performance, though no concrete numerical results are provided.
Overparametrization has been remarkably successful in deep learning. This study investigates an overlooked but important aspect of overparametrized neural networks: the null components in the parameters of neural networks, or the ghosts. Since deep learning is not explicitly regularized, typical deep learning solutions contain null components. In this paper, we present a structure theorem of the null space for a general class of neural networks. Specifically, we show that any null element can be uniquely written as a linear combination of ridgelet transforms. In general, it is quite difficult to fully characterize the null space of an arbitrarily given operator. Therefore, the structure theorem is a great advantage for understanding the complicated landscape of neural network parameters. As applications, we discuss the roles of ghosts in the generalization performance of deep learning.
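To make the notion of a null component concrete, the following is a minimal finite-dimensional sketch, not the construction used in the paper. It assumes a shallow network with fixed random hidden units and tanh activation, evaluated only on a finite grid of inputs, so that the map from output weights to network outputs is a matrix `Phi` with more columns than rows. All names (`null_space_basis`, `Phi`, `ghost`, the sizes `m` and `p`) are hypothetical and chosen for illustration. The infinite-dimensional null space and its ridgelet description in the paper are not reproduced here.

```python
import numpy as np


def null_space_basis(Phi, tol=1e-12):
    """Return an orthonormal basis (as columns) of the numerical null space of Phi."""
    _, s, Vt = np.linalg.svd(Phi, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    # Rows of Vt beyond the numerical rank span the null space.
    return Vt[rank:].T


rng = np.random.default_rng(0)
m, p = 20, 200  # m grid points, p hidden units (p > m: overparametrized)

# Assumed toy model: x -> sum_j c_j * tanh(a_j * x - b_j) with fixed (a_j, b_j).
x = np.linspace(-1.0, 1.0, m)
a = rng.normal(size=p)
b = rng.normal(size=p)
Phi = np.tanh(np.outer(x, a) - b)  # shape (m, p)

c = rng.normal(size=p)             # some output weights
N = null_space_basis(Phi)          # shape (p, p - rank)
ghost = N @ rng.normal(size=N.shape[1])

# Adding a ghost changes the parameters but not the outputs on the grid.
out_plain = Phi @ c
out_ghost = Phi @ (c + ghost)
max_diff = float(np.max(np.abs(out_plain - out_ghost)))

# Projecting out the null component gives weights with smaller norm
# and (numerically) the same outputs on the grid.
c_min = c - N @ (N.T @ c)
null_part_of_c_min = float(np.linalg.norm(N.T @ c_min))

print("null space dimension:", N.shape[1])
print("max output difference after adding a ghost:", max_diff)
print("norm of c:", np.linalg.norm(c), "norm of c_min:", np.linalg.norm(c_min))
```

In this toy setting, a generic weight vector such as `c` carries a nonzero null component unless something like the projection above (or an explicit norm penalty) removes it, which mirrors the abstract's remark that unregularized solutions typically contain ghosts.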