CLApr 8, 2014

A Convolutional Neural Network for Modelling Sentences

Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom

arXiv:1404.2188v13643 citations

AI Analysis

This work addresses the need for accurate sentence modeling in natural language processing, offering a method applicable to any language without relying on parse trees, but it is incremental as it builds on existing convolutional architectures.

The authors tackled the problem of sentence representation for language understanding by introducing the Dynamic Convolutional Neural Network (DCNN), which achieved excellent performance in sentiment prediction and question classification tasks, including a greater than 25% error reduction in Twitter sentiment prediction compared to the strongest baseline.

The ability to accurately represent sentences is central to language understanding. We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences. The network uses Dynamic k-Max Pooling, a global pooling operation over linear sequences. The network handles input sentences of varying length and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations. The network does not rely on a parse tree and is easily applicable to any language. We test the DCNN in four experiments: small scale binary and multi-class sentiment prediction, six-way question classification and Twitter sentiment prediction by distant supervision. The network achieves excellent performance in the first three tasks and a greater than 25% error reduction in the last task with respect to the strongest baseline.

View on arXiv PDF

Similar