CL · LG · Jun 16, 2015

Tree-structured composition in neural networks without tree-structured architectures

arXiv:1506.04834v3 · 76 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses how neural networks learn compositional structure, a question of interest to researchers in natural language processing. The contribution is incremental, since it confirms existing limitations.

The paper investigates whether neural sequence models such as LSTMs can implicitly learn recursive compositional structure, using an artificial data task in which that structure is crucial. It finds that LSTMs can exploit the structure but consistently underperform tree-structured models.

Tree-structured neural networks encode a particular tree geometry for a sentence in the network design. However, these models have at best only slightly outperformed simpler sequence-based models. We hypothesize that neural sequence models like LSTMs are in fact able to discover and implicitly use recursive compositional structure, at least for tasks with clear cues to that structure in the data. We demonstrate this possibility using an artificial data task for which recursive compositional structure is crucial, and find an LSTM-based sequence model can indeed learn to exploit the underlying tree structure. However, its performance consistently lags behind that of tree models, even on large training sets, suggesting that tree-structured models are more effective at exploiting recursive structure.
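The contrast the abstract draws between tree-structured and sequence-based composition can be illustrated with a minimal toy sketch. It is not the paper's model or data: the `compose` function, its fixed weights, the `EMB` vectors, and the bracketed input format are all hypothetical and chosen only for illustration. The tree encoder follows the bracketing it is given, while the sequence encoder folds the words left to right and never sees the tree.

```python
import math

# Hypothetical toy embeddings; a real model would learn these.
EMB = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, -0.5]}


def compose(left, right):
    """Toy composition: merge two vectors into one of the same size.

    The fixed weights 0.5 and 0.3 are assumptions standing in for learned
    parameters. The function is deliberately non-associative, so the
    order of composition changes the result.
    """
    return [math.tanh(0.5 * l + 0.3 * r) for l, r in zip(left, right)]


def parse(tokens):
    """Parse a fully bracketed binary expression into nested tuples."""
    stack = []
    for tok in tokens:
        if tok == "(":
            continue  # opening brackets carry no extra information here
        if tok == ")":
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))
        else:
            stack.append(tok)
    if len(stack) != 1:
        raise ValueError("expression is not a single binary tree")
    return stack[0]


def tree_encode(tree, emb):
    """Compose bottom-up, following the tree geometry given as input."""
    if isinstance(tree, str):
        return emb[tree]
    left, right = tree
    return compose(tree_encode(left, emb), tree_encode(right, emb))


def sequence_encode(words, emb):
    """Compose strictly left to right, with no access to the tree."""
    state = emb[words[0]]
    for word in words[1:]:
        state = compose(state, emb[word])
    return state


left_branching = tree_encode(parse("( ( a b ) c )".split()), EMB)
right_branching = tree_encode(parse("( a ( b c ) )".split()), EMB)
sequential = sequence_encode(["a", "b", "c"], EMB)
```

In this sketch the sequence encoder coincides with the left-branching tree and differs from the right-branching one. A sequence model with a fixed composition order would therefore have to recover any other bracketing from cues in the data, which is the kind of implicit structure learning the abstract asks about.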

Code Implementations: 1 repo
