CL · AI · Aug 17, 2015

Molding CNNs for text: non-linear, non-consecutive convolutions

arXiv:1508.04112v2 · 147 citations
Originality: Incremental advance
AI Analysis

This work aims to improve the accuracy and training speed of text classification, and represents an incremental advance in CNN architectures.

The paper adapts the temporal convolution in CNNs to text processing by using low-rank n-gram tensors and non-consecutive word patterns. It reports state-of-the-art performance, including 51.2% accuracy on fine-grained sentiment classification.

The success of deep learning often derives from well-chosen operational building blocks. In this work, we revise the temporal convolution operation in CNNs to better adapt it to text processing. Instead of concatenating word representations, we appeal to tensor algebra and use low-rank n-gram tensors to directly exploit interactions between words already at the convolution stage. Moreover, we extend the n-gram convolution to non-consecutive words to recognize patterns with intervening words. Through a combination of low-rank tensors and pattern weighting, we can efficiently evaluate the resulting convolution operation via dynamic programming. We test the resulting architecture on standard sentiment classification and news categorization tasks. Our model achieves state-of-the-art performance both in terms of accuracy and training speed. For instance, we obtain 51.2% accuracy on the fine-grained sentiment classification task.
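The abstract does not spell out the recurrence, so the following is only a minimal sketch of how a low-rank, non-consecutive 3-gram feature map could be evaluated with dynamic programming. It assumes the n-gram tensor is factored into three matrices (called `P`, `Q`, `R` here) and that each skipped word multiplies the pattern weight by a decay factor (`decay`). These names, the shapes, and the exact weighting scheme are illustrative assumptions, not taken from the page above; the output projection and non-linearity of a full layer are omitted. A brute-force enumeration over all word triples is included only to check that the linear-time recurrence computes the same thing.

```python
import itertools
import numpy as np


def nonconsecutive_trigram_features(X, P, Q, R, decay):
    """Linear-time DP over a sentence.

    X: (L, d) word vectors; P, Q, R: (h, d) low-rank factors (assumed names).
    Returns an (L, h) array whose row k aggregates all 3-grams ending at k.
    """
    h = P.shape[0]
    s1 = np.zeros(h)  # decayed sum of 1-gram features before position k
    s2 = np.zeros(h)  # decayed sum of 2-gram features before position k
    out = np.zeros((X.shape[0], h))
    for k, x in enumerate(X):
        px, qx, rx = P @ x, Q @ x, R @ x
        out[k] = s2 * rx          # extend every earlier 2-gram with word k
        f2 = s1 * qx              # 2-grams that end at word k
        s2 = decay * s2 + f2      # each later skip costs one factor of decay
        s1 = decay * s1 + px
    return out


def brute_force_trigram_features(X, P, Q, R, decay):
    """Reference: explicit sum over all i < j < k, cubic in sentence length."""
    out = np.zeros((X.shape[0], P.shape[0]))
    for i, j, k in itertools.combinations(range(X.shape[0]), 3):
        skipped = k - i - 2  # number of intervening words in the pattern
        out[k] += decay ** skipped * (P @ X[i]) * (Q @ X[j]) * (R @ X[k])
    return out
```

With `decay = 0` the sketch reduces to ordinary consecutive 3-grams, since any pattern with an intervening word gets zero weight; values closer to 1 let patterns with gaps contribute more.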

Code Implementations: 2 repos
