ML · LG · NE · Oct 9, 2019

Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods

arXiv:1910.04233v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work offers machine learning researchers a new theoretical perspective on sequence modeling, but it is incremental: it reinterprets existing methods rather than introducing entirely new ones.

The paper addresses the modeling of time-dependent data by deriving neural network architectures such as the LSTM and CNN from kernel-based approaches. It shows that the kernel-derived variants perform comparably to or better than traditional neural methods, with significant improvements in a neuroscience application.

We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally. By considering dynamic gating of the memory cell, a model closely related to the long short-term memory (LSTM) recurrent neural network is derived. Extending this setup to $n$-gram filters, the convolutional neural network (CNN), Gated CNN, and recurrent additive network (RAN) are also recovered as special cases. Our analysis provides a new perspective on the LSTM, while also extending it to $n$-gram convolutional filters. Experiments are performed on natural language processing tasks and on analysis of local field potentials (neuroscience). We demonstrate that the variants we derive from kernels perform on par with or even better than traditional neural methods. For the neuroscience application, the new models demonstrate significant improvements relative to the prior state of the art.
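To make the idea of a gated memory cell concrete, here is a minimal sketch of an additive gated update in the style of the recurrent additive network (RAN) that the abstract mentions. It is not the paper's kernel derivation or its released code. The function names, the `params` dictionary layout, and the choice of `tanh` as the output nonlinearity are all assumptions made for illustration, and bias terms are omitted for brevity.

```python
import math


def sigmoid(z):
    """Logistic function, used here for the input and forget gates."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gated_additive_step(x, h_prev, c_prev, params):
    """One step of a RAN-style gated memory cell (illustrative only).

    params is a hypothetical dict of weight matrices:
      W_c        maps the input to the candidate content,
      W_ix, W_ih produce the input gate,
      W_fx, W_fh produce the forget gate.
    """
    # Candidate content is a linear function of the current input.
    x_tilde = matvec(params["W_c"], x)
    # Gates depend on the current input and the previous hidden state.
    i_gate = [sigmoid(a + b) for a, b in
              zip(matvec(params["W_ix"], x), matvec(params["W_ih"], h_prev))]
    f_gate = [sigmoid(a + b) for a, b in
              zip(matvec(params["W_fx"], x), matvec(params["W_fh"], h_prev))]
    # Additive memory update: gated new content plus gated old memory.
    c = [i * xt + f * cp
         for i, xt, f, cp in zip(i_gate, x_tilde, f_gate, c_prev)]
    # Hidden state is a nonlinearity applied to the memory cell.
    h = [math.tanh(ck) for ck in c]
    return h, c


def run_sequence(xs, params, hidden_size):
    """Run the cell over a sequence, starting from zero state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in xs:
        h, c = gated_additive_step(x, h, c, params)
    return h, c
```

With all gate weights set to zero, both gates equal 0.5 at every step, so the memory cell reduces to an exponentially decaying sum of the transformed inputs. That special case is a convenient sanity check before trying learned weights.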

Code Implementations: 1 repo
