CL, AI, LG | Feb 28, 2015

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

arXiv:1503.00075v3 | 3250 citations
Originality: Highly original
AI Analysis

This work addresses a limitation of linear-chain LSTMs: they process words strictly in sequence and do not explicitly model the syntactic structure of natural language. It offers NLP researchers and practitioners a tree-structured alternative.

The authors model natural language syntax by generalizing LSTM networks to tree-structured topologies. The resulting Tree-LSTM outperforms existing systems and strong LSTM baselines on semantic relatedness prediction and sentiment classification.

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that would naturally combine words to phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Tree-LSTMs outperform all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank).
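The generalization described in the abstract can be illustrated with a minimal sketch of one Tree-LSTM unit, assuming the Child-Sum variant (one of the two variants defined in the paper), in which a node sums the hidden states of its children and applies a separate forget gate to each child. The parameter names (`W_i`, `U_i`, `b_i`, ...), the helper functions, the random initialization, and the `(vector, [subtrees])` tree encoding are illustrative assumptions, not the authors' code. Plain Python lists are used so the example has no dependencies.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def affine(W, x, U, h, b):
    """Return W x + U h + b as a list."""
    return [wx + uh + bi for wx, uh, bi in zip(matvec(W, x), matvec(U, h), b)]


def init_params(input_dim, hidden_dim, scale=0.1, seed=0):
    """Hypothetical initializer: small uniform weights, zero biases."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]

    params = {}
    for gate in ("i", "f", "o", "u"):
        params["W_" + gate] = mat(hidden_dim, input_dim)
        params["U_" + gate] = mat(hidden_dim, hidden_dim)
        params["b_" + gate] = [0.0] * hidden_dim
    return params


def tree_lstm_node(x, children, p):
    """One Child-Sum Tree-LSTM step.

    x        -- input vector for this node
    children -- list of (h_k, c_k) pairs, empty for a leaf
    Returns the node's (h, c).
    """
    d = len(p["b_i"])
    # Sum of the children's hidden states (all zeros for a leaf).
    h_sum = [sum(h_k[j] for h_k, _ in children) for j in range(d)]

    i = [sigmoid(z) for z in affine(p["W_i"], x, p["U_i"], h_sum, p["b_i"])]
    o = [sigmoid(z) for z in affine(p["W_o"], x, p["U_o"], h_sum, p["b_o"])]
    u = [math.tanh(z) for z in affine(p["W_u"], x, p["U_u"], h_sum, p["b_u"])]

    # Memory cell: gated candidate plus each child's cell, where every
    # child gets its own forget gate computed from that child's hidden state.
    c = [i_j * u_j for i_j, u_j in zip(i, u)]
    for h_k, c_k in children:
        f_k = [sigmoid(z)
               for z in affine(p["W_f"], x, p["U_f"], h_k, p["b_f"])]
        c = [c_j + f_j * ck_j for c_j, f_j, ck_j in zip(c, f_k, c_k)]

    h = [o_j * math.tanh(c_j) for o_j, c_j in zip(o, c)]
    return h, c


def encode(tree, p):
    """Recursively encode a tree given as (x, [subtrees]), bottom-up."""
    x, subtrees = tree
    children = [encode(t, p) for t in subtrees]
    return tree_lstm_node(x, children, p)


# Usage: a root node with two leaf children, 2-d inputs, 3-d hidden state.
params = init_params(input_dim=2, hidden_dim=3)
root = ([0.5, 0.5], [([1.0, 0.0], []), ([0.0, 1.0], [])])
h_root, c_root = encode(root, params)
```

When every node has exactly one child, this unit reduces to a standard LSTM step over a chain, which is the sense in which the Tree-LSTM generalizes the linear-chain model.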

Code Implementations: 16 repos

Data from Papers with Code (CC-BY-SA-4.0)

