CLAILGDec 19, 2023

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference

arXiv:2312.11882v228 citationsh-index: 8AAAI
Originality Incremental advance
AI Analysis

This work addresses a specific bottleneck in accelerating language model inference, offering an incremental improvement over existing early exiting methods.

The authors tackled the inconsistency between training and inference in early exiting methods for language models, proposing ConsistentEE, which uses reinforcement learning and a hardness-guided reward to align training with inference, resulting in improved performance on natural language tasks.

Early Exiting is one of the most popular methods to achieve efficient inference. Current early exiting methods adopt the (weighted) sum of the cross entropy loss of all internal classifiers during training, imposing all these classifiers to predict all instances correctly. However, during inference, as long as one internal classifier predicts an instance correctly, it can accelerate without losing accuracy. Thus, there is a notable gap between training and inference. We propose ConsistentEE, an early exiting method that is consistent in training and inference. ConsistentEE formulates the early exiting process as a reinforcement learning problem. A policy network is added to decide whether an instance should exit or continue. The training objective of ConsistentEE only require each instance to be predicted correctly by one internal classifier. Additionally, we introduce the concept Memorize Layer to measure the hardness of an instance. We incorporate memorized layer into reward function design, which allows "easy" instances to focus more on acceleration while "hard" instances to focus more on accuracy. Experimental results show that our method outperforms other baselines on various natural language understanding and generation tasks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes