Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
This work addresses the computational inefficiency of BERT on unsupervised tasks, offering a faster alternative for tasks such as reranking and semantic similarity.
The paper addresses BERT's need for repetitive inference on unsupervised tasks by proposing the Transformer-based Text Autoencoder (T-TA). In CPU run-time experiments, T-TA runs over six times faster than a BERT-based model on reranking and twelve times faster on semantic similarity, while maintaining competitive or better accuracy.
Although BERT has delivered performance improvements on various supervised learning tasks, applying it to unsupervised tasks is still limited by the need for repetitive inference to compute contextual language representations. To resolve this limitation, we propose a novel deep bidirectional language model called the Transformer-based Text Autoencoder (T-TA). The T-TA computes contextual language representations without repetition while retaining the benefits of a deep bidirectional architecture like that of BERT. In run-time experiments in CPU environments, the proposed T-TA runs over six times faster than the BERT-based model on the reranking task and twelve times faster on the semantic similarity task. Furthermore, the T-TA achieves competitive or even better accuracy than BERT on these tasks.
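
To make the "repetitive inference" cost concrete, the following is a minimal sketch, not the authors' implementation. It assumes the repetition comes from the usual masked-language-model scoring procedure, in which each token is masked in turn and the model is run once per masked copy; the abstract itself does not spell out this mechanism. The names `MASK`, `masked_variants`, `passes_with_masking`, and `passes_without_repetition` are hypothetical, and no real model is called: the code only counts how many forward passes each approach would need.

```python
from typing import List

MASK = "[MASK]"  # placeholder mask symbol, assumed for illustration


def masked_variants(tokens: List[str]) -> List[List[str]]:
    """Return one copy of the sequence per position, with that position masked."""
    return [tokens[:i] + [MASK] + tokens[i + 1:] for i in range(len(tokens))]


def passes_with_masking(tokens: List[str]) -> int:
    """Assumed masked-LM procedure: one forward pass per masked copy (n passes)."""
    return len(masked_variants(tokens))


def passes_without_repetition(tokens: List[str]) -> int:
    """A model that yields every position's representation at once: one pass."""
    return 1 if tokens else 0


tokens = "the cat sat on the mat".split()
variants = masked_variants(tokens)

for variant in variants:
    print(" ".join(variant))

print("forward passes with masking:", passes_with_masking(tokens))
print("forward passes without repetition:", passes_without_repetition(tokens))
```

Under this assumption, the number of forward passes grows linearly with sentence length for the masked procedure, while a non-repetitive model needs a single pass. The sketch counts passes only and says nothing about the wall-clock speedups reported above, which come from the paper's own CPU experiments.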