An Exact Kernel Equivalence for Finite Classification Models
This theoretical work offers a tool for analyzing neural network predictions and generalization, though its advance is incremental relative to existing kernel approximations such as the Neural Tangent Kernel (NTK).
The paper derives the first exact kernel representation of any finite-size parametric classification model trained with gradient descent, shows that this kernel can be computed up to machine precision, and uses it to gain insight into how such models generalize.
We explore the equivalence between neural networks and kernel methods by deriving the first exact representation of any finite-size parametric classification model trained with gradient descent as a kernel machine. We compare our exact representation to the well-known Neural Tangent Kernel (NTK) and discuss approximation error relative to the NTK and other non-exact path kernel formulations. We experimentally demonstrate that the kernel can be computed for realistic networks up to machine precision. We use this exact kernel to show that our theoretical contribution can provide useful insights into the predictions made by neural networks, particularly the way in which they generalize.
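To make the idea of an exact, finite-size kernel representation concrete, here is a minimal sketch of one way such an identity can arise. It is an illustration under stated assumptions, not the paper's implementation. By the fundamental theorem of calculus, the change in a prediction over one gradient-descent step equals the integral of the test-point gradient along the straight segment between consecutive weights, and that integral expands into a weighted sum of gradient inner products with each training point. The tiny two-layer tanh network, the logistic loss, the full-batch updates, the Gauss-Legendre quadrature, and every name in the code (`grad_f`, `dloss`, `kernel_sum`) are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 3, 4, 8                       # input dim, hidden units, training points (all hypothetical)
X = rng.normal(size=(n, d))
y = np.sign(rng.normal(size=n))         # labels in {-1, +1}
x_test = rng.normal(size=d)

def unpack(w):
    # Split the flat parameter vector into hidden weights W and output weights v.
    return w[:h * d].reshape(h, d), w[h * d:]

def f(w, x):
    # Scalar output of a two-layer tanh network (an assumed toy model).
    W, v = unpack(w)
    return v @ np.tanh(W @ x)

def grad_f(w, x):
    # Gradient of f with respect to the flat parameter vector.
    W, v = unpack(w)
    a = np.tanh(W @ x)
    dW = np.outer(v * (1.0 - a**2), x)
    return np.concatenate([dW.ravel(), a])

def dloss(out, label):
    # Derivative of the logistic loss log(1 + exp(-label * out)) w.r.t. out.
    return -label / (1.0 + np.exp(label * out))

# Gauss-Legendre nodes and weights, mapped from [-1, 1] to [0, 1].
nodes, weights = np.polynomial.legendre.leggauss(20)
t_nodes, t_weights = 0.5 * (nodes + 1.0), 0.5 * weights

lr, steps = 0.1, 50
w = 0.5 * rng.normal(size=h * d + h)
f0 = f(w, x_test)                       # prediction at initialization
kernel_sum = 0.0
for _ in range(steps):
    G = np.stack([grad_f(w, xi) for xi in X])                  # training gradients, shape (n, p)
    a = np.array([dloss(f(w, xi), yi) for xi, yi in zip(X, y)])
    step = -lr * (a @ G)                                       # full-batch gradient-descent update
    # Average the test-point gradient along the segment from w to w + step.
    avg_grad = sum(tw * grad_f(w + t * step, x_test)
                   for t, tw in zip(t_nodes, t_weights))
    K = G @ avg_grad                    # per-step kernel values between x_test and each x_i
    kernel_sum += -lr * (a @ K)         # kernel-machine contribution of this step
    w = w + step

direct = f(w, x_test)                   # trained network evaluated directly
kernel_pred = f0 + kernel_sum           # same prediction expressed as a sum over kernel values
print(abs(direct - kernel_pred))
```

In this sketch the only discrepancy between `direct` and `kernel_pred` comes from quadrature and floating-point rounding, which is the sense in which such a representation can hold "up to machine precision". Replacing the path-averaged gradient with the gradient at initialization would give an NTK-style approximation instead, which is generally not exact for a finite network.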