CL · AI · Nov 4, 2021

A text autoencoder from transformer for fast encoding language representation

arXiv:2111.02844v1
Originality: Incremental advance
AI Analysis

This work addresses the high computational cost of BERT for natural language processing, making it more accessible, though it is incremental as it builds on existing transformer architectures.

The authors tackled the computational inefficiency of BERT by proposing a transformer-based autoencoder with a window masking mechanism, achieving O(n) complexity instead of O(n²) and demonstrating higher accuracy in SMS classification and semantic similarity tasks.

In recent years, BERT has shown clear advantages and great potential in natural language processing tasks. However, both training and applying BERT require intensive time and resources for computing contextual language representations, which hinders its universality and applicability. To overcome this bottleneck, we propose a deep bidirectional language model that uses a window masking mechanism at the attention layer. This work computes contextual language representations without the random masking used in BERT, while maintaining a deep bidirectional architecture like BERT's. To compute the same sentence representation, our method has O($n$) complexity, compared with O($n^2$) for other transformer-based models. To further demonstrate its superiority, we compute contextual language representations in CPU environments; using the embeddings from the proposed method, logistic regression shows much higher accuracy in SMS classification. Moreover, the proposed method also achieves significantly higher performance in semantic similarity tasks.
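The abstract does not spell out how the window mask is built, so the following is only a minimal sketch of one common reading: each token attends to neighbours within a fixed distance `w` on either side. The names `window_indices`, `windowed_attention`, and `count_scored_pairs` are hypothetical, and whether a token attends to itself is an assumption (this sketch includes it). It illustrates why a fixed window makes the number of query-key scores grow linearly with sentence length rather than quadratically; it is not the authors' implementation.

```python
import math


def window_indices(i, n, w):
    """Positions token i may attend to: those within w steps on either side.

    Assumption: the token itself is included; the paper may mask it out.
    """
    return range(max(0, i - w), min(n, i + w + 1))


def windowed_attention(q, k, v, w):
    """Scaled dot-product attention restricted to a local window.

    q, k, v are lists of n vectors (lists of floats). Each output row is
    computed from at most 2*w + 1 keys, so the total work is O(n * w)
    instead of the O(n^2) of full self-attention.
    """
    n, d = len(q), len(q[0])
    dv = len(v[0])
    out = []
    for i in range(n):
        idx = list(window_indices(i, n, w))
        # Scores only against keys inside the window.
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d) for j in idx]
        # Numerically stable softmax over the window.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(
            [sum(wt * v[j][c] for wt, j in zip(weights, idx)) / total for c in range(dv)]
        )
    return out


def count_scored_pairs(n, w):
    """Number of query-key pairs scored for a sentence of length n."""
    return sum(len(window_indices(i, n, w)) for i in range(n))
```

For a sentence of 10 tokens and `w = 1`, this sketch scores 28 query-key pairs, where full attention would score 100. With `w` held fixed the count is at most `n * (2 * w + 1)`, which is linear in `n`.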

