CL, AI

Aug 17, 2022

Understanding Long Documents with Different Position-Aware Attentions

Hai Pham, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang

CMU

arXiv:2208.08201v12.112 citations

h-index: 45

Originality: Incremental advance

AI Analysis

This paper addresses the computational and efficiency challenges of processing long multimodal documents in NLP applications, but it appears incremental because it only modifies the attention mechanism within existing transformer architectures.

The paper tackles the problem of long document understanding by exploring different position-aware attention mechanisms with shortened context, and reports advantages based on various evaluation metrics.

Despite several successes in document understanding, the practical task of long document understanding remains largely under-explored because of challenges in computation and in efficiently absorbing long multimodal input. Most current transformer-based approaches handle only short documents and use solely textual information for attention, owing to prohibitive computation and memory limits. To address these issues in long document understanding, we explore different approaches to handling 1D and new 2D position-aware attention with essentially shortened context. Experimental results show that our proposed models have advantages for this task based on various evaluation metrics. Furthermore, our model changes only the attention mechanism and can therefore be easily adapted to any transformer-based architecture.
