LG | AI | Jul 5, 2024

The Impact of Quantization and Pruning on Deep Reinforcement Learning Models

arXiv:2407.04803v1 | 8 citations | h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of deploying deep reinforcement learning in resource-constrained environments, but it is incremental as it applies existing compression methods to DRL without major breakthroughs.

The study investigated how quantization and pruning affect deep reinforcement learning models, finding that while these compression methods reduce model size, they generally do not improve energy efficiency, with trade-offs identified across performance factors like average return and inference time.

Deep reinforcement learning (DRL) has achieved remarkable success across various domains, such as video games, robotics, and, recently, large language models. However, the computational costs and memory requirements of DRL models often limit their deployment in resource-constrained environments. This challenge underscores the urgent need to explore neural network compression methods to make DRL models more practical and broadly applicable. Our study investigates the impact of two prominent compression methods, quantization and pruning, on DRL models. We examine how these techniques influence four performance factors (average return, memory, inference time, and battery utilization) across various DRL algorithms and environments. Although model size decreases, we find that these compression techniques generally do not improve the energy efficiency of DRL models. We provide insights into the trade-offs between model compression and DRL performance, offering guidelines for deploying efficient DRL models in resource-constrained settings.
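
To make the two compression methods named in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not say which quantization or pruning variants were used, so this assumes unstructured magnitude pruning and symmetric 8-bit linear quantization. The function names (`prune_by_magnitude`, `quantize_int8`, `dequantize`) and the random weight vector are hypothetical and chosen only for illustration.

```python
import random

def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to zero
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in q]

# Hypothetical stand-in for one layer of a policy network (assumption).
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

pruned = prune_by_magnitude(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

# Dense storage cost: 4 bytes per float32 weight vs 1 byte per int8 weight.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(q)
max_error = max(abs(a - b) for a, b in zip(pruned, restored))

print(f"zeros after pruning: {sum(w == 0.0 for w in pruned)} / {len(pruned)}")
print(f"dense storage: {fp32_bytes} bytes (float32) -> {int8_bytes} bytes (int8)")
print(f"max round-trip error: {max_error:.5f}")
```

The sketch only accounts for storage. Zeroed weights still occupy space in a dense layout unless a sparse format is used, and a smaller model does not by itself imply lower energy use, which is consistent with the abstract's finding that size reductions did not generally translate into energy savings.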
