CL, Aug 10, 2018

Hierarchical Attention: What Really Counts in Various NLP Tasks

arXiv:1808.03728v1, 2 citations
Originality: Highly original
AI Analysis

This paper addresses performance limitations in NLP tasks such as text generation and reading comprehension, offering a novel approach to enhancing attention mechanisms.

The paper tackles the bottleneck of attention mechanisms lacking hierarchical features in NLP tasks by proposing a Hierarchical Attention Mechanism (Ham), achieving a state-of-the-art BLEU score of 0.26 on Chinese poem generation and a 6.5% average improvement in machine reading comprehension.

Attention mechanisms in sequence-to-sequence models have shown strong performance in various natural language processing (NLP) tasks, such as sentence embedding, text generation, machine translation, and machine reading comprehension. Unfortunately, existing attention mechanisms learn only either high-level or low-level features. In this paper, we argue that the lack of hierarchical mechanisms is a bottleneck in improving the performance of attention mechanisms, and we propose a novel Hierarchical Attention Mechanism (Ham) based on the weighted sum of different layers of a multi-level attention. Ham achieves a state-of-the-art BLEU score of 0.26 on the Chinese poem generation task and an average improvement of nearly 6.5% over existing machine reading comprehension models such as BIDAF and Match-LSTM. Furthermore, our experiments and theorems reveal that Ham has greater generalization and representation ability than existing attention mechanisms.
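
The "weighted sum of different layers of a multi-level attention" can be illustrated with a minimal sketch. The abstract does not say how the levels are built, so the code below rests on assumptions: each level is plain dot-product attention, the output of one level is fed back as the query of the next, and the per-level weights come from a softmax over hypothetical learnable logits (`level_logits`). All function names are illustrative and are not taken from the authors' code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(query, keys, values):
    """Dot-product attention of one query vector over key/value vectors."""
    scores = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    return [sum(s * v[j] for s, v in zip(scores, values)) for j in range(dim)]


def hierarchical_attention(query, keys, values, level_logits):
    """Weighted sum of the outputs of len(level_logits) stacked attention levels.

    Assumption: level i reuses the output of level i-1 as its query, so
    query, key, and value vectors must all share the same dimension.
    """
    level_weights = softmax(level_logits)  # one weight per level, summing to 1
    outputs = []
    current = query
    for _ in level_logits:
        current = attention(current, keys, values)
        outputs.append(current)
    dim = len(outputs[0])
    return [
        sum(w * out[j] for w, out in zip(level_weights, outputs))
        for j in range(dim)
    ]


# Toy data: three key/value pairs in two dimensions.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
query = [0.5, -0.5]

result = hierarchical_attention(query, keys, values, level_logits=[0.0, 0.0, 0.0])
single = hierarchical_attention(query, keys, values, level_logits=[0.0])
```

With a single level the sketch reduces to ordinary attention, which is one way to see the hierarchical form as a generalization: the extra levels only add terms to the weighted sum.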

Code Implementations: 1 repo
