LG AI CV | Jan 11, 2024

Knowledge Translation: A New Pathway for Model Compression

Wujie Sun, Defang Chen, Jiawei Chen, Yan Feng, Chun Chen, Can Wang

arXiv:2401.05772v12.61 citations | h-index: 23 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient deep learning models by proposing a new pathway for compression. It is rated incremental because it builds on existing compression ideas, although its translation-inspired formulation is new.

The paper tackles model compression by introducing Knowledge Translation (KT), a framework that trains a translation model to generate compressed parameters from the parameters of a larger model, without retraining the compressed model, and demonstrates its feasibility on MNIST.

Deep learning has witnessed significant advancements in recent years at the cost of increasing training, inference, and model storage overhead. While existing model compression methods strive to reduce the number of model parameters while maintaining high accuracy, they inevitably necessitate the re-training of the compressed model or impose architectural constraints. To overcome these limitations, this paper presents a novel framework, termed Knowledge Translation (KT), wherein a "translation" model is trained to receive the parameters of a larger model and generate compressed parameters. The concept of KT draws inspiration from language translation, which effectively employs neural networks to convert different languages, maintaining identical meaning. Accordingly, we explore the potential of neural networks to convert models of disparate sizes, while preserving their functionality. We propose a comprehensive framework for KT, introduce data augmentation strategies to enhance model performance despite restricted training data, and successfully demonstrate the feasibility of KT on the MNIST dataset. Code is available at https://github.com/zju-SWJ/KT.
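
The core idea in the abstract, a model that takes a larger network's parameters as input and emits parameters for a smaller one, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' architecture or code: the one-hidden-layer MLPs, the layer sizes, the helper names (`mlp_shapes`, `flatten`, `unflatten`, `translate`), and the use of a single untrained linear map as the "translation" model. It shows only the data flow, not how such a model would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_shapes(n_in, n_hidden, n_out):
    """Parameter shapes (w1, b1, w2, b2) of a one-hidden-layer MLP."""
    return [(n_in, n_hidden), (n_hidden,), (n_hidden, n_out), (n_out,)]

def flatten(params):
    """Concatenate all parameter arrays into one 1-D vector."""
    return np.concatenate([p.ravel() for p in params])

def unflatten(vec, shapes):
    """Split a 1-D vector back into arrays with the given shapes."""
    params, start = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        params.append(vec[start:start + size].reshape(shape))
        start += size
    return params

def mlp_forward(params, x):
    """Forward pass of the MLP with ReLU activation."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2

# Hypothetical sizes, chosen only to keep the example small.
N_IN, N_OUT = 16, 10
large_shapes = mlp_shapes(N_IN, 32, N_OUT)
small_shapes = mlp_shapes(N_IN, 8, N_OUT)
n_large = sum(int(np.prod(s)) for s in large_shapes)
n_small = sum(int(np.prod(s)) for s in small_shapes)

# Placeholder "translation" model: one untrained linear map from the
# large parameter vector to the small one (an assumption, not the paper's model).
translator = rng.normal(scale=n_large ** -0.5, size=(n_large, n_small))

def translate(large_params):
    """Map large-model parameters to small-model parameters."""
    return unflatten(flatten(large_params) @ translator, small_shapes)

large_params = [rng.normal(size=s) for s in large_shapes]
small_params = translate(large_params)

# The generated parameters are used directly, with no retraining step.
x = rng.normal(size=(4, N_IN))
logits = mlp_forward(small_params, x)
print(n_large, n_small, logits.shape)
```

In this sketch the small network is never trained on its own: its weights come entirely from the translation step, which is the property the abstract contrasts with compression methods that require retraining.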

