CL · LG · ML · Mar 2, 2015

Bayesian Optimization of Text Representations

arXiv:1503.00693v1 · 45 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for automated, black-box NLP systems that spare researchers and practitioners manual tuning.

The paper formulates the choice of text representation in NLP as a global optimization problem and applies Bayesian optimization to it, making standard linear models competitive with state-of-the-art methods on tasks such as topic classification and sentiment analysis.

When applying machine learning to problems in NLP, there are many choices to make about how to represent input texts. These choices can have a big effect on performance, but they are often uninteresting to researchers or practitioners who simply need a module that performs well. We propose an approach to optimizing over this space of choices, formulating the problem as global optimization. We apply a sequential model-based optimization technique and show that our method makes standard linear models competitive with more sophisticated, expensive state-of-the-art methods based on latent variable models or neural networks on various topic classification and sentiment analysis problems. Our approach is a first step towards black-box NLP systems that work with raw text and do not require manual tuning.
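The sequential model-based optimization loop mentioned in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the paper's setup: the search space `SPACE`, the additive surrogate in `surrogate_score`, and `toy_objective`, which stands in for the cross-validated accuracy of a linear classifier trained on a given representation. The surrogate is a deliberately crude substitute for the probabilistic models normally used in Bayesian optimization.

```python
import itertools
import math
import random

# Hypothetical space of text-representation choices (illustrative only).
SPACE = {
    "max_ngram": [1, 2, 3],
    "weighting": ["binary", "tf", "tfidf"],
    "remove_stopwords": [False, True],
}


def all_configs(space):
    """Enumerate every combination of choices as a dict."""
    keys = sorted(space)
    return [dict(zip(keys, values))
            for values in itertools.product(*(space[k] for k in keys))]


def surrogate_score(config, history, explore=0.1):
    """Crude additive surrogate: per-choice mean score plus an exploration
    bonus that shrinks as a choice is tried more often."""
    overall = sum(s for _, s in history) / len(history) if history else 0.0
    total = 0.0
    for key, value in config.items():
        scores = [s for c, s in history if c[key] == value]
        mean = sum(scores) / len(scores) if scores else overall
        total += mean + explore / math.sqrt(1 + len(scores))
    return total


def smbo(objective, space, n_trials=10, n_init=3, seed=0):
    """Evaluate a few random configurations, then repeatedly evaluate the
    untried configuration that the surrogate ranks highest."""
    rng = random.Random(seed)
    candidates = all_configs(space)
    rng.shuffle(candidates)
    history = []
    for trial in range(min(n_trials, len(candidates))):
        if trial < n_init:
            config = candidates.pop()
        else:
            config = max(candidates, key=lambda c: surrogate_score(c, history))
            candidates.remove(config)
        history.append((config, objective(config)))
    return max(history, key=lambda pair: pair[1])


def toy_objective(config):
    """Stand-in for validation accuracy; the numbers are made up."""
    score = 0.70
    score += {1: 0.00, 2: 0.05, 3: 0.03}[config["max_ngram"]]
    score += {"binary": 0.02, "tf": 0.00, "tfidf": 0.04}[config["weighting"]]
    score += 0.01 if config["remove_stopwords"] else 0.0
    return score


if __name__ == "__main__":
    best_config, best_score = smbo(toy_objective, SPACE, n_trials=10)
    print(best_config, round(best_score, 3))
```

In practice the objective would train and validate a real classifier, which is what makes each evaluation expensive and a model-guided search worthwhile compared with grid or random search.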
