On a Family of Decomposable Kernels on Sequences
This work addresses sequence similarity for applications with ordered data, but it appears incremental as it builds on existing kernel methods.
The paper tackles the problem of measuring similarity between sequences of different lengths by introducing a family of Mercer kernel functions with a decomposable structure, and it shows competitive performance compared to the state-of-the-art Global Alignment kernel in sequential classification tasks.
In many applications, data is naturally presented in terms of orderings of some basic elements or symbols. Reasoning about such data requires a notion of similarity capable of handling sequences of different lengths. In this paper we describe a family of Mercer kernel functions for such sequentially structured data. The family is characterized by a decomposable structure in terms of symbol-level and structure-level similarities, representing a specific combination of kernels that allows for efficient computation. We provide an experimental evaluation on sequential classification tasks, comparing kernels from our family to a state-of-the-art sequence kernel called the Global Alignment kernel, which has been shown to outperform Dynamic Time Warping.
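The abstract does not give the kernel's formula, so the following is only an illustrative sketch of what a decomposable sequence kernel could look like, not the construction used in the paper. It assumes one simple combination: a Gaussian symbol-level kernel on element values, multiplied by a Gaussian structure-level kernel on element positions, summed over all pairs of elements. The function name `decomposable_kernel` and the parameters `gamma_sym` and `gamma_pos` are hypothetical. Because it is a sum of products of positive semidefinite kernels, this combination is itself a Mercer kernel and accepts sequences of different lengths.

```python
import numpy as np


def decomposable_kernel(s, t, gamma_sym=1.0, gamma_pos=0.5):
    """Illustrative sequence kernel (an assumption, not the paper's definition).

    k(s, t) = sum_i sum_j k_sym(s_i, t_j) * k_pos(i, j)

    s, t: non-empty sequences of scalars or fixed-size vectors;
    their lengths may differ.
    """
    # Represent each sequence as an (n, d) array of symbols.
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    s = s.reshape(len(s), -1)
    t = t.reshape(len(t), -1)

    # Symbol-level similarity: Gaussian kernel between all pairs of elements.
    sq_dist = ((s[:, None, :] - t[None, :, :]) ** 2).sum(axis=-1)
    k_sym = np.exp(-gamma_sym * sq_dist)

    # Structure-level similarity: Gaussian kernel between element positions.
    i = np.arange(len(s))[:, None]
    j = np.arange(len(t))[None, :]
    k_pos = np.exp(-gamma_pos * (i - j) ** 2)

    # Decomposable combination: elementwise product, summed over all pairs.
    return float((k_sym * k_pos).sum())


if __name__ == "__main__":
    a = [0.0, 1.0, 2.0]
    b = [0.0, 1.0]
    print(decomposable_kernel(a, b))
```

In this sketch the cost is proportional to the product of the two sequence lengths, and the two Gram matrices are computed independently of each other, which is the kind of separation that a decomposable structure makes possible.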