LG CL May 28, 2023

A Quantitative Review on Language Model Efficiency Research

arXiv:2306.01768v12.0

Originality: Synthesis-oriented

AI Analysis

The paper addresses a gap in the literature for NLP and ML researchers by synthesizing existing data, but it is incremental: it builds on prior reviews without introducing new methods.

This paper tackles the lack of quantitative analysis in language model efficiency research by conducting a meta-analysis of efficient Transformers and state space models. It provides a review and suggestions for future work and does not present new experimental results.

Language models (LMs) are being scaled up and are becoming more powerful. Improving their efficiency is one of the core research topics in neural information processing systems. Tay et al. (2022) provided a comprehensive overview of efficient Transformers, which have become an indispensable staple in the field of NLP. However, in the section "On Evaluation", they left open the question of "which fundamental efficient Transformer one should consider," answering that it is "still a mystery" because "many research papers select their own benchmarks." Unfortunately, there was no quantitative analysis of the performance of Transformers on any benchmark. Moreover, state space models (SSMs), which were not discussed in the prior review, have demonstrated their ability to model long-range sequences with non-attention mechanisms. This article presents a meta-analysis of the results from a set of papers on efficient Transformers as well as those on SSMs. It provides a quantitative review of LM efficiency research and gives suggestions for future research.
