Dynamic Self-Attention: Computing Attention over Words Dynamically for Sentence Embedding
This work addresses sentence encoding for NLP tasks, offering an incremental improvement by adapting an existing method to a new domain.
The paper tackles sentence embedding by proposing Dynamic Self-Attention (DSA), a new self-attention mechanism that adapts dynamic routing from capsule networks to NLP. Among sentence encoding methods, it achieves state-of-the-art results on the SNLI dataset with the fewest parameters, and it performs competitively on the SST dataset.
In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying dynamic routing in capsule networks (Sabour et al., 2017) for natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods on the Stanford Natural Language Inference (SNLI) dataset with the fewest parameters, while showing comparable results on the Stanford Sentiment Treebank (SST) dataset.
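The abstract does not spell out the update rule, so the following is only a minimal sketch of what attention with a dynamic weight vector could look like, assuming a routing-by-agreement loop in the spirit of dynamic routing. The function name `dynamic_self_attention`, the `tanh` nonlinearity, the default iteration count, and the use of raw word vectors without any learned projection are all assumptions made for illustration, not details taken from the text.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_self_attention(words, iterations=2):
    """Routing-style attention over word vectors (illustrative sketch only).

    words: non-empty list of equal-length float lists, one vector per word.
    Returns (sentence_vector, attention_weights).
    """
    if not words or iterations < 1:
        raise ValueError("need at least one word and one iteration")
    dim = len(words[0])
    logits = [0.0] * len(words)  # zero logits give uniform attention at first
    for _ in range(iterations):
        # Attention is normalized over the words of the sentence.
        weights = softmax(logits)
        # Weighted sum of word vectors, one output dimension at a time.
        pooled = [
            sum(w * vec[d] for w, vec in zip(weights, words))
            for d in range(dim)
        ]
        # Assumed nonlinearity; the abstract does not name one.
        sentence = [math.tanh(v) for v in pooled]
        # Raise the logit of each word that agrees (dot product) with the
        # current sentence vector; the updated logits feed the next pass.
        logits = [
            q + sum(x * z for x, z in zip(vec, sentence))
            for q, vec in zip(logits, words)
        ]
    return sentence, weights


# Toy input: two words point the same way, one points the opposite way.
words = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
sentence, weights = dynamic_self_attention(words, iterations=2)
```

On this toy input the two agreeing words end up with more weight than the opposing one after the second pass, which is the sense in which the weights are "dynamic": they are recomputed from the current sentence vector rather than produced by a single fixed scoring step.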