Contextualized Non-local Neural Networks for Sequence Learning
This work addresses sequence learning challenges in NLP, offering improved performance and interpretability, though it is incremental as it builds on existing methods like Transformers and GNNs.
The paper tackled the problem of sequence learning by combining self-attention and graph neural networks to dynamically construct task-specific sentence structures and leverage local dependencies, resulting in a model that outperformed competitive baselines on ten NLP tasks.
Recently, a large number of neural mechanisms and models have been proposed for sequence learning, of which self-attention, as exemplified by the Transformer model, and graph neural networks (GNNs) have attracted much attention. In this paper, we propose an approach that combines and draws on the complementary strengths of these two methods. Specifically, we propose contextualized non-local neural networks (CN$^{\textbf{3}}$), which can both dynamically construct a task-specific structure of a sentence and leverage rich local dependencies within a particular neighborhood. Experimental results on ten NLP tasks in text classification, semantic matching, and sequence labeling show that our proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.