Channel Pruning In Quantization-aware Training: An Adaptive Projection-gradient Descent-shrinkage-splitting Method
This work addresses model compression for efficient deployment in resource-constrained environments. It appears incremental, since it builds on existing penalty-based pruning and quantization techniques.
The paper integrates channel pruning into quantization-aware training through an adaptive projection-gradient descent-shrinkage-splitting method (APGDSSM), which searches for weights in the quantized and sparse subspaces concurrently. The abstract claims compression with improved efficiency but gives no concrete numbers.
We propose an adaptive projection-gradient descent-shrinkage-splitting method (APGDSSM) to integrate penalty-based channel pruning into quantization-aware training (QAT). APGDSSM concurrently searches weights in both the quantized subspace and the sparse subspace. APGDSSM uses a shrinkage operator and a splitting technique to create sparse weights, and a Group Lasso penalty to push the weight sparsity into channel sparsity. In addition, we propose a novel complementary transformed l1 penalty to stabilize the training for extreme compression.
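The building blocks named in the abstract (a shrinkage operator, a Group Lasso penalty over channels, and a projection onto a quantized subspace) can be illustrated with a minimal NumPy sketch. This is not the authors' APGDSSM: the splitting scheme, the adaptive schedule, and the complementary transformed l1 penalty are not specified in the abstract and are left out. The function names, the threshold `lam`, and the symmetric uniform quantizer are assumptions chosen only for illustration.

```python
import numpy as np


def soft_threshold(w, lam):
    """Element-wise shrinkage: proximal operator of lam * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


def group_soft_threshold(w, lam):
    """Channel-wise shrinkage: proximal operator of lam * sum_c ||w[c]||_2.

    w has shape (out_channels, ...); each output channel is one group.
    A channel whose L2 norm is at most lam is zeroed out entirely, which
    is how a Group Lasso penalty turns weight sparsity into channel sparsity.
    """
    flat = w.reshape(w.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    # Guard against division by zero for channels that are already zero.
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return (flat * scale).reshape(w.shape)


def project_uniform(w, bits=4):
    """Project weights onto a symmetric uniform grid (assumed quantizer).

    The grid has 2**(bits - 1) - 1 positive levels, the same number of
    negative levels, and zero; the step size is set by the largest weight.
    """
    levels = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    step = max_abs / levels
    return np.clip(np.round(w / step), -levels, levels) * step


# Two output channels: one with a large norm (5.0), one with a small norm (0.5).
w = np.array([[3.0, 4.0], [0.3, 0.4]])
w_sparse = group_soft_threshold(w, lam=1.0)   # second channel is pruned
w_quant = project_uniform(w_sparse, bits=3)   # survivors snap to the grid
```

In this toy case the second channel has norm 0.5, below the threshold of 1.0, so the group shrinkage removes it as a whole, while the first channel is only scaled down before being quantized. A pruned channel stays exactly zero after projection because zero lies on the symmetric grid.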