CL Jan 20, 2020

Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching

Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu

arXiv:2001.07234v1 0.88 citations

Originality: Incremental advance

AI Analysis

This paper addresses textual sequence matching in NLP, offering incremental improvements for tasks that rely on pre-computed sequence representations.

The paper tackles the problem of noise in simple sequence matching with Transformers by proposing a multi-level head-wise matching and aggregation approach, achieving new state-of-the-art performance on tasks like SNLI, MNLI, QQP, and SQuAD-binary.

Transformer has been successfully applied to many natural language processing tasks. However, for textual sequence matching, simple matching between the representation of a pair of sequences might bring in unnecessary noise. In this paper, we propose a new approach to sequence pair matching with Transformer, by learning head-wise matching representations on multiple levels. Experiments show that our proposed approach can achieve new state-of-the-art performance on multiple tasks that rely only on pre-computed sequence-vector-representation, such as SNLI, MNLI-match, MNLI-mismatch, QQP, and SQuAD-binary.
