CL · AI · Aug 17, 2022

Understanding Long Documents with Different Position-Aware Attentions

CMU
arXiv:2208.08201v1 · 12 citations · h-index: 45
Originality: Incremental advance
AI Analysis

This paper addresses computational and efficiency challenges in processing long multimodal documents for NLP applications, but appears incremental because it modifies only the attention within existing transformer architectures.

The paper tackles the problem of long document understanding by exploring different position-aware attention mechanisms with shortened context, and reports advantages based on various evaluation metrics.

Despite several successes in document understanding, the practical task of long document understanding remains largely under-explored because of challenges in computation and in efficiently absorbing long multimodal input. Most current transformer-based approaches handle only short documents and use only textual information for attention, because of prohibitive computation and memory limits. To address these issues in long document understanding, we explore different approaches to handling 1D and new 2D position-aware attention with essentially shortened context. Experimental results show that our proposed models have advantages for this task based on various evaluation metrics. Furthermore, our model changes only the attention and can thus be easily adapted to any transformer-based architecture.
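
The abstract does not spell out the mechanism, so the following is only an illustrative sketch of the general idea, not the authors' model. It assumes single-head attention in which each token attends to a short window of neighbours (the "shortened context"), with additive penalties based on 1D reading-order distance and 2D page-layout distance. The function name, the `window`, `alpha`, and `beta` parameters, and the linear distance penalties are all hypothetical choices made for this example.

```python
import math

def position_aware_attention(q, k, v, xy, window=1, alpha=0.5, beta=0.5):
    """Hypothetical single-head attention with 1D and 2D distance penalties.

    q, k, v: lists of n vectors (lists of floats).
    xy:      list of n (x, y) token centres on the page (2D layout).
    Token i attends only to tokens j with |i - j| <= window.
    """
    n, d = len(q), len(q[0])
    out, weights = [], []
    for i in range(n):
        scores = []
        for j in range(n):
            if abs(i - j) > window:
                scores.append(-math.inf)  # outside the shortened context
                continue
            dot = sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            dist_1d = abs(i - j)                # reading-order distance
            dist_2d = math.dist(xy[i], xy[j])   # layout distance (assumed Euclidean)
            scores.append(dot - alpha * dist_1d - beta * dist_2d)
        # Numerically stable softmax; masked entries become exactly 0.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        out.append([sum(row[j] * v[j][c] for j in range(n))
                    for c in range(len(v[0]))])
    return out, weights

# Toy input: four tokens laid out on a 2x2 grid.
q = k = v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
out, w = position_aware_attention(q, k, v, xy, window=1)
```

Because the position terms only adjust the attention scores, a change of this kind leaves the rest of a transformer layer untouched, which is consistent with the abstract's claim that only the attention is modified.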

