CL AI LG | May 24, 2024

Large Language Model Pruning

Hanjuan Huang, Hao-Jia Song, Hsing-Kuo Pao

arXiv:2406.00030v14.23 citations | h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the challenge of making LLMs more efficient and trustworthy for NLP applications, though it appears incremental as it builds on existing pruning methods.

The paper tackles the problem of large language models (LLMs) suffering from issues like overfitting and device limitations by proposing a model pruning technique that emphasizes explainability and uses mutual information-based estimation to eliminate redundant neurons, demonstrating superiority over state-of-the-art models.

Over the last few years, larger models have delivered better performance, as both hardware and software have matured enough to support extremely large models. Their applications include text mining, among other fields. In particular, the success of LLMs in text understanding and text generation has drawn attention from researchers who have worked on NLP and related areas for years or even decades. On the other hand, LLMs may suffer from problems such as overfitting, hallucination, and device limitations, to name a few. In this work, we propose a model pruning technique designed specifically for LLMs. The proposed methodology emphasizes the explainability of deep learning models. Building on a theoretical foundation, we obtain a trustworthy deep model, so that huge models with a massive number of parameters become less necessary. A mutual information-based estimation is adopted to identify redundant neurons to eliminate. Moreover, an estimator with well-tuned parameters yields precise estimates to guide the pruning procedure. We also explore the difference between pruning large-scale models and pruning small-scale models: the choice of pruning criterion is sensitive for small models but not for large-scale models, which is a novel finding of this work. Overall, we demonstrate the superiority of the proposed model over state-of-the-art models.
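The abstract does not specify the estimator or the pruning criterion, so the following is only a rough sketch of what mutual information-guided redundancy pruning could look like. It assumes a simple histogram (plug-in) estimator and a greedy pairwise rule; the function names, the `bins` and `threshold` parameters, and the rule itself are illustrative assumptions, not the authors' method.

```python
import numpy as np


def mutual_information(x, y, bins=16):
    """Histogram (plug-in) estimate of I(X; Y) in nats for two 1-D samples.

    Assumption: a generic binned estimator, not the estimator used in the paper.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint distribution over bins
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nonzero = pxy > 0                      # skip empty cells to avoid log(0)
    ratio = pxy[nonzero] / (px @ py)[nonzero]
    return float(np.sum(pxy[nonzero] * np.log(ratio)))


def redundant_neurons(activations, threshold, bins=16):
    """Greedily flag neurons that share at least `threshold` nats of
    estimated MI with an earlier neuron that was kept.

    `activations` has shape (n_samples, n_neurons). Hypothetical criterion,
    chosen only to illustrate the idea of pruning redundant neurons.
    """
    kept, pruned = [], []
    for j in range(activations.shape[1]):
        is_redundant = any(
            mutual_information(activations[:, k], activations[:, j], bins) >= threshold
            for k in kept
        )
        (pruned if is_redundant else kept).append(j)
    return pruned
```

In this sketch, a neuron whose activations are nearly a copy of an already-kept neuron gets flagged, while statistically independent neurons are kept. The plug-in estimator is biased upward for small samples, which is one reason the abstract's emphasis on a well-tuned estimator matters in practice.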
