CL · LG · Aug 10, 2015

Learning Structural Kernels for Natural Language Processing

arXiv:1508.02131v1 · 21 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in kernel-based NLP methods, offering an incremental improvement for researchers and practitioners in the field.

The paper tackles model selection for structural kernels in NLP, a problem that is often overlooked. It proposes a Bayesian approach based on Gaussian Processes, which in experiments on tree kernels yields better prediction performance than hyperparameter optimization via grid search.

Structural kernels are a flexible learning paradigm that has been widely used in Natural Language Processing. However, the problem of model selection in kernel-based methods is usually overlooked. Previous approaches mostly rely on setting default values for kernel hyperparameters or using grid search, which is slow and coarse-grained. In contrast, Bayesian methods allow efficient model selection by maximizing the evidence on the training data through gradient-based methods. In this paper we show how to perform this in the context of structural kernels by using Gaussian Processes. Experimental results on tree kernels show that this procedure results in better prediction performance compared to hyperparameter optimization via grid search. The framework proposed in this paper can be adapted to other structures besides trees, e.g., strings and graphs, thereby extending the utility of kernel-based methods.
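
As a rough illustration of the evidence-maximization idea described above, the following sketch fits the decay hyperparameter of a structural kernel by gradient ascent on the Gaussian Process log marginal likelihood. It is not the paper's implementation: the weighted n-gram kernel stands in for a tree kernel, the noise variance is held fixed, and all names (`ngram_counts`, `kernel_and_grad`, `log_evidence`, `fit_decay`), the step size, and the bounds on the decay are assumptions made for this example.

```python
import math
from collections import Counter

def ngram_counts(tokens, max_n=3):
    """Count contiguous n-grams up to length max_n (toy structural features)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def kernel_and_grad(a, b, lam):
    """k(a, b) = sum_u lam^|u| * c_u(a) * c_u(b), together with dk/dlam."""
    k = dk = 0.0
    for u, ca in a.items():
        cb = b.get(u, 0)
        if cb:
            n = len(u)
            k += lam ** n * ca * cb
            dk += n * lam ** (n - 1) * ca * cb
    return k, dk

def cholesky(A):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def log_evidence(feats, y, lam, noise):
    """Return log p(y | X, lam, noise) and its derivative with respect to lam."""
    n = len(y)
    K = [[0.0] * n for _ in range(n)]
    dK = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            K[i][j], dK[i][j] = kernel_and_grad(feats[i], feats[j], lam)
        K[i][i] += noise  # noise variance does not depend on lam
    L = cholesky(K)
    alpha = chol_solve(L, y)  # alpha = K^{-1} y
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    lml = (-0.5 * sum(y[i] * alpha[i] for i in range(n))
           - 0.5 * logdet - 0.5 * n * math.log(2.0 * math.pi))
    # Gradient: 0.5 * alpha^T dK alpha - 0.5 * tr(K^{-1} dK)
    quad = sum(alpha[i] * dK[i][j] * alpha[j] for i in range(n) for j in range(n))
    trace = sum(chol_solve(L, [dK[i][j] for i in range(n)])[j] for j in range(n))
    return lml, 0.5 * quad - 0.5 * trace

def fit_decay(feats, y, lam=0.5, noise=0.1, lr=0.05, steps=100):
    """Gradient ascent on log(lam); returns the best (lam, log evidence) seen."""
    log_lam = math.log(lam)
    best_lam, best_lml = lam, float("-inf")
    for _ in range(steps):
        lam = math.exp(log_lam)
        lml, grad = log_evidence(feats, y, lam, noise)
        if lml > best_lml:
            best_lam, best_lml = lam, lml
        # Chain rule for the log parametrization, with a clipped step for stability.
        step = max(-0.2, min(0.2, lr * grad * lam))
        log_lam = max(math.log(1e-3), min(math.log(2.0), log_lam + step))
    return best_lam, best_lml

# Hypothetical toy data: token sequences with real-valued targets.
TOY_SENTENCES = ["the cat sat", "the cat sat down", "a dog ran",
                 "a dog ran away", "the dog sat"]
TOY_TARGETS = [1.0, 1.2, -1.0, -1.3, 0.1]

if __name__ == "__main__":
    feats = [ngram_counts(s.split()) for s in TOY_SENTENCES]
    print(fit_decay(feats, TOY_TARGETS))
```

A grid search would instead evaluate a held-out metric at a handful of preset decay values. Here the decay moves continuously along the evidence gradient, and the same loop could update several hyperparameters at once, provided the kernel exposes a derivative for each.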
