CL AI LG NE Oct 30, 2025

Elastic Architecture Search for Efficient Language Models

arXiv:2510.27037v1 ICME
Originality: Incremental advance
AI Analysis

This work addresses efficiency concerns for deploying language models in resource-constrained environments, representing an incremental improvement over existing NAS approaches.

The paper tackles the high computational and memory costs of large pre-trained language models by introducing the Elastic Language Model (ELM), a neural architecture search method that discovers compact models. The authors report that these models significantly outperform existing methods on masked and causal language modeling tasks.

As large pre-trained language models become increasingly critical to natural language understanding (NLU) tasks, their substantial computational and memory requirements have raised significant economic and environmental concerns. Addressing these challenges, this paper introduces the Elastic Language Model (ELM), a novel neural architecture search (NAS) method optimized for compact language models. ELM extends existing NAS approaches by introducing a flexible search space with efficient transformer blocks and dynamic modules for dimension and head number adjustment. These innovations enhance the efficiency and flexibility of the search process, which facilitates more thorough and effective exploration of model architectures. We also introduce novel knowledge distillation losses that preserve the unique characteristics of each block, in order to improve the discrimination between architectural choices during the search process. Experiments on masked language modeling and causal language modeling tasks demonstrate that models discovered by ELM significantly outperform existing methods.
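
To make the idea of a search space with adjustable dimension and head number more concrete, the following is a minimal sketch, not the paper's method. The candidate values, the `LayerChoice` class, and the `valid_choices` and `sample_architecture` helpers are all hypothetical names chosen for illustration. The abstract does not specify ELM's actual search space, block types, or sampling strategy.

```python
import random
from dataclasses import dataclass

# Hypothetical candidate values; assumptions for illustration only,
# not the search space used by ELM.
HIDDEN_DIMS = (128, 256, 384, 512)
HEAD_COUNTS = (2, 4, 8, 12)


@dataclass(frozen=True)
class LayerChoice:
    """One per-layer architectural choice: hidden size and attention heads."""
    hidden_dim: int
    num_heads: int

    @property
    def head_dim(self) -> int:
        # Width of each attention head for this choice.
        return self.hidden_dim // self.num_heads


def valid_choices(dims=HIDDEN_DIMS, heads=HEAD_COUNTS):
    """Enumerate (dim, heads) pairs where the dimension splits evenly across heads."""
    return [LayerChoice(d, h) for d in dims for h in heads if d % h == 0]


def sample_architecture(num_layers, rng):
    """Draw one candidate architecture by choosing independently for each layer."""
    choices = valid_choices()
    return [rng.choice(choices) for _ in range(num_layers)]


# Sample a 4-layer candidate with a fixed seed for reproducibility.
arch = sample_architecture(num_layers=4, rng=random.Random(0))
for i, layer in enumerate(arch):
    print(i, layer.hidden_dim, layer.num_heads, layer.head_dim)
```

In a real NAS pipeline, a sampled candidate like `arch` would then be scored, for example by training within a weight-sharing supernet; that part is outside the scope of this sketch.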

