Enhanced Recurrent Neural Tangent Kernels for Non-Time-Series Data
This work incrementally improves the theoretical understanding and practical application of neural tangent kernels for machine learning researchers: it extends recurrent neural network (RNN) kernels to more complex architectures and demonstrates their utility on non-time-series data.
This paper extends recurrent neural network (RNN) kernels to bidirectional RNNs and RNNs with average pooling, and provides a fast GPU implementation. Classifiers using these enhanced RNN-based kernels outperform a range of baseline methods on 90 non-time-series datasets from the UCI repository.
Kernels derived from deep neural networks (DNNs) in the infinite-width regime provide not only high performance in a range of machine learning tasks but also new theoretical insights into DNN training dynamics and generalization. In this paper, we extend the family of kernels associated with recurrent neural networks (RNNs), which were previously derived only for simple RNNs, to more complex architectures including bidirectional RNNs and RNNs with average pooling. We also develop a fast GPU implementation to exploit the full practical potential of the kernels. Though RNNs are typically only applied to time-series data, we demonstrate that classifiers using RNN-based kernels outperform a range of baseline methods on 90 non-time-series datasets from the UCI data repository.
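To make the idea of an RNN-based kernel on non-time-series data concrete, the following is a minimal sketch rather than the kernels or GPU implementation developed in this paper. It assumes a simple (unidirectional) ReLU RNN with a zero initial state, reads each feature vector as a sequence of scalar inputs, and computes only the infinite-width Gaussian-process covariance at the last time step; it does not cover the tangent kernel, bidirectional RNNs, or average pooling. The function names `relu_expectation` and `rnn_gp_kernel` and the variance parameters `sigma_w`, `sigma_u`, `sigma_b` are hypothetical choices made for this illustration.

```python
import numpy as np

def relu_expectation(k11, k12, k22):
    """E[relu(u) relu(v)] for (u, v) ~ N(0, [[k11, k12], [k12, k22]]).

    This is the closed-form degree-1 arc-cosine expression.
    """
    norm = np.sqrt(k11 * k22)
    cos_theta = np.clip(k12 / np.maximum(norm, 1e-12), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    return norm * (np.sin(theta) + (np.pi - theta) * cos_theta) / (2.0 * np.pi)

def rnn_gp_kernel(X, sigma_w=1.0, sigma_u=1.0, sigma_b=0.1):
    """Infinite-width covariance of a simple ReLU RNN (illustrative assumption).

    X has shape (n_samples, n_features); feature t of each row is treated
    as the scalar input at time step t. Returns an (n_samples, n_samples)
    matrix for the hidden activations at the final step.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    # Pre-activation covariance at the first step (zero initial hidden state).
    K = sigma_u**2 * np.outer(X[:, 0], X[:, 0]) + sigma_b**2
    for t in range(1, n_features):
        d = np.diag(K)
        # Covariance of the previous step's post-ReLU activations.
        V = relu_expectation(d[:, None], K, d[None, :])
        # Recurrent term plus the contribution of the new input and the bias.
        K = sigma_w**2 * V + sigma_u**2 * np.outer(X[:, t], X[:, t]) + sigma_b**2
    d = np.diag(K)
    return relu_expectation(d[:, None], K, d[None, :])

# Example: Gram matrix for six samples with five tabular features each.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 5))
K_demo = rnn_gp_kernel(X_demo)
```

A Gram matrix of this kind could be passed to any classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. The order in which features are fed to the recursion affects the result, which is a design choice this sketch leaves open.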