LG AI CL Dec 15, 2024

SEE: Sememe Entanglement Encoding for Transformer-bases Models Compression

arXiv:2412.12204v1
h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the problem of deploying large language models in resource-constrained scenarios and represents an incremental improvement in model compression techniques.

The paper tackles the high storage and computational costs of transformer-based large language models. It proposes the Sememe Entanglement Encoding (SEE) algorithm, which compresses models through low-rank approximation guided by expert knowledge in the form of sememes. The authors report stable performance with fewer parameters and lower computational cost.

Transformer-based large language models exhibit groundbreaking capabilities, but their storage and computational costs are prohibitively high, limiting their application in resource-constrained scenarios. An effective approach is to eliminate redundant model parameters and computational costs while incorporating efficient expert-derived knowledge structures to achieve a balance between compression and performance. Therefore, we propose the Sememe Entanglement Encoding (SEE) algorithm. Guided by expert prior knowledge, the model is compressed through the low-rank approximation idea. In Entanglement Embedding, basic semantic units such as sememes are represented as low-dimensional vectors, and then reconstructed into high-dimensional word embeddings through the combination of generalized quantum entanglement. We adapt the Sememe Entanglement Encoding algorithm to transformer-based models of different magnitudes. Experimental results indicate that our approach achieves stable performance while compressing model parameters and computational costs.
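
The reconstruction step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so this assumes the common tensor-product form: a high-dimensional vector is built as a sum over a few rank terms, each being the Kronecker product of several low-dimensional vectors. The function names, the sizes (rank 2, order 3, factor dimension 4), and the random placeholder values are all hypothetical, and how sememes are assigned to the factor slots is not modelled here.

```python
import random
from functools import reduce


def kron(a, b):
    """Kronecker (tensor) product of two flat vectors."""
    return [x * y for x in a for y in b]


def entangled_embedding(factors):
    """Sum over rank terms of the Kronecker product of each term's factors.

    factors[k][j] is the j-th low-dimensional vector of rank term k.
    With order n and factor dimension q, the result has q**n entries.
    """
    terms = [reduce(kron, term) for term in factors]
    return [sum(vals) for vals in zip(*terms)]


# Hypothetical sizes (not from the paper): rank 2, order 3, factor dim 4.
rank, order, q = 2, 3, 4
rng = random.Random(0)
# Random placeholders standing in for learned low-dimensional vectors.
factors = [[[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(order)]
           for _ in range(rank)]

embedding = entangled_embedding(factors)
stored = rank * order * q   # numbers actually stored per embedding
full_dim = q ** order       # dimension of the reconstructed embedding
print(len(embedding), stored, full_dim)  # prints "64 24 64"
```

With these illustrative sizes, 24 stored values expand into a 64-dimensional vector. A sum of two or more product terms generally cannot be written as a single product, which is the sense in which such a combination is called entangled.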

