Deep Unfolding with Kernel-based Quantization in MIMO Detection
This work addresses energy-efficient model deployment for MIMO detection in edge computing, representing an incremental improvement over existing quantization methods.
The paper tackles the challenge of deploying deep unfolding models for MIMO detection on resource-constrained edge devices by proposing a kernel-based adaptive quantization framework that achieves higher accuracy than traditional methods and reduces inference latency.
The development of edge computing places critical demands on energy-efficient model deployment for multiple-input multiple-output (MIMO) detection tasks. Deploying deep unfolding models such as PGD-Nets and ADMM-Nets on resource-constrained edge devices by means of quantization is challenging. Existing methods based on quantization-aware training (QAT) suffer from performance degradation because they rely on parametric assumptions about the activation distribution and on static quantization step sizes. To address these challenges, this paper proposes a novel kernel-based adaptive quantization (KAQ) framework for deep unfolding networks. A joint kernel density estimation (KDE) and maximum mean discrepancy (MMD) approach aligns the activation distributions of the full-precision and quantized models, which eliminates the need for prior distribution assumptions. Additionally, a dynamic step size updating method adjusts the quantization step size according to the channel conditions of the wireless network. Extensive simulations demonstrate that the proposed KAQ framework achieves higher accuracy than traditional methods while reducing the model's inference latency.
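To make the distribution-alignment idea concrete, the following is a minimal sketch of how a sample-based MMD between full-precision and quantized activations might be computed. It is an illustration under stated assumptions, not the KAQ loss itself: the abstract does not specify the kernel, the estimator, or the quantizer, so the Gaussian kernel, the biased (V-statistic) estimator, the symmetric uniform quantizer, and the names `uniform_quantize`, `gaussian_mmd2`, `step`, `bits`, and `bandwidth` are all hypothetical choices made here.

```python
import numpy as np


def uniform_quantize(x, step, bits=4):
    """Symmetric uniform quantizer (assumed form, not taken from the paper).

    Rounds x to the nearest multiple of `step` and clips the integer code
    to the signed range representable with `bits` bits.
    """
    qmax = 2 ** (bits - 1) - 1
    codes = np.clip(np.round(np.asarray(x, dtype=float) / step), -qmax - 1, qmax)
    return step * codes


def gaussian_mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two 1-D samples.

    Uses a Gaussian kernel; no parametric form is assumed for either
    sample, which is the property the abstract emphasizes.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)

    def kernel(a, b):
        # Pairwise Gaussian kernel matrix between column vectors a and b.
        return np.exp(-((a - b.T) ** 2) / (2.0 * bandwidth ** 2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    activations = rng.standard_normal(512)  # stand-in for full-precision activations

    for step, bits in [(1.0, 4), (0.05, 8)]:
        quantized = uniform_quantize(activations, step=step, bits=bits)
        print(f"step={step}, bits={bits}: MMD^2 = {gaussian_mmd2(activations, quantized):.3e}")
```

In a training setup, a differentiable version of such a discrepancy could serve as a regularizer that pulls the quantized activations toward the full-precision ones; how KAQ combines it with KDE and with the dynamic step size is described in the paper body rather than in this abstract.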