CL · Jun 2, 2021

Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling

arXiv:2106.01040v3720 citations
Originality: Incremental advance
AI Analysis

Hi-Transformer addresses the computational bottleneck that researchers and practitioners face when handling long documents, but the advance is incremental because it builds on existing hierarchical and Transformer-based approaches.

The paper tackles the quadratic growth of Transformer complexity with input length, which makes long document modeling expensive. It proposes Hi-Transformer, a hierarchical interactive Transformer that reduces this complexity while still capturing global document context, and it reports improved efficiency and effectiveness on three benchmark datasets.

Transformer is important for text modeling. However, it has difficulty in handling long documents due to the quadratic complexity with input text length. In order to handle this problem, we propose a hierarchical interactive Transformer (Hi-Transformer) for efficient and effective long document modeling. Hi-Transformer models documents in a hierarchical way, i.e., it first learns sentence representations and then learns document representations. It can effectively reduce the complexity and meanwhile capture global document context in the modeling of each sentence. More specifically, we first use a sentence Transformer to learn the representation of each sentence. Then we use a document Transformer to model the global document context from these sentence representations. Next, we use another sentence Transformer to enhance sentence modeling using the global document context. Finally, we use a hierarchical pooling method to obtain the document embedding. Extensive experiments on three benchmark datasets validate the efficiency and effectiveness of Hi-Transformer in long document modeling.
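To make the complexity claim concrete, the sketch below counts how many pairwise attention scores one layer computes when a document is processed as a single flat sequence versus through the three stages the abstract describes. It is a rough illustration, not the authors' implementation. The function names, the document shape (64 sentences of 32 tokens), the single layer per stage, and the one extra context token per sentence in the second sentence Transformer are all assumptions made for the example.

```python
def flat_attention_pairs(num_sentences: int, sentence_len: int) -> int:
    """Attention scores for one self-attention layer over the whole document."""
    total_tokens = num_sentences * sentence_len
    return total_tokens ** 2


def hierarchical_attention_pairs(num_sentences: int, sentence_len: int) -> int:
    """Attention scores for one layer in each of the three hierarchical stages."""
    # Stage 1: sentence Transformer, attention only within each sentence.
    first_sentence_pass = num_sentences * sentence_len ** 2
    # Stage 2: document Transformer, attention across sentence representations.
    document_pass = num_sentences ** 2
    # Stage 3: second sentence Transformer. Assumption: each sentence carries
    # one extra token holding its document-context-aware representation.
    second_sentence_pass = num_sentences * (sentence_len + 1) ** 2
    return first_sentence_pass + document_pass + second_sentence_pass


if __name__ == "__main__":
    m, k = 64, 32  # hypothetical document: 64 sentences of 32 tokens each
    flat = flat_attention_pairs(m, k)
    hier = hierarchical_attention_pairs(m, k)
    print(f"flat: {flat:,} scores")          # flat: 4,194,304 scores
    print(f"hierarchical: {hier:,} scores")  # hierarchical: 139,328 scores
    print(f"reduction: {flat / hier:.1f}x")
```

Under these assumptions the flat cost grows with the square of the total token count, while the hierarchical cost grows with the number of sentences times the square of the sentence length, plus the square of the number of sentences. The gap therefore widens as documents get longer.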

