RNM-TD3: N:M Semi-structured Sparse Reinforcement Learning From Scratch
This work addresses the challenge of enabling hardware acceleration in sparse reinforcement learning without performance degradation. The contribution is incremental in that it adapts existing structured sparsity techniques to RL.
The paper tackles the problem of balancing compression, performance, and hardware efficiency in deep reinforcement learning by introducing N:M structured sparsity. It shows that RNM-TD3 outperforms its dense counterpart at 50%-75% sparsity, with up to a 14% increase in performance at 2:4 sparsity on the Ant environment.
Sparsity is a well-studied technique for compressing deep neural networks (DNNs) without compromising performance. In deep reinforcement learning (DRL), neural networks retaining as few as 5% of their original weights can still be trained with minimal performance loss compared to their dense counterparts. However, most existing methods rely on unstructured fine-grained sparsity, which limits hardware acceleration opportunities due to irregular computation patterns. Structured coarse-grained sparsity enables hardware acceleration, yet typically degrades performance and increases pruning complexity. In this work, we present, to the best of our knowledge, the first study on N:M structured sparsity in RL, which balances compression, performance, and hardware efficiency. Our framework enforces row-wise N:M sparsity throughout training for all networks in off-policy RL (TD3), maintaining compatibility with accelerators that support N:M sparse matrix operations. Experiments on continuous-control benchmarks show that RNM-TD3, our N:M sparse agent, outperforms its dense counterpart at 50%-75% sparsity (e.g., 2:4 and 1:4), achieving up to a 14% increase in performance at 2:4 sparsity on the Ant environment. RNM-TD3 remains competitive even at 87.5% sparsity (1:8), while enabling potential training speedups.
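To make the row-wise N:M constraint concrete, the following is a minimal sketch of how such a mask could be built for a single weight matrix. It assumes a magnitude-based selection rule (keep the N largest-magnitude weights in every group of M consecutive entries along a row), which is a common choice for N:M sparsity but is not confirmed by the abstract as the criterion RNM-TD3 uses. The names `nm_mask` and `sparsity` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def nm_mask(weights, n, m):
    """Return a 0/1 mask that keeps n entries in each group of m
    consecutive weights along every row.

    Assumption: entries are ranked by magnitude; the paper's actual
    selection rule may differ.
    """
    rows, cols = weights.shape
    if cols % m != 0:
        raise ValueError("number of columns must be divisible by m")
    # View each row as (cols // m) groups of m weights.
    groups = np.abs(weights).reshape(rows, cols // m, m)
    # argsort is ascending, so the last n indices are the largest magnitudes.
    keep = np.argsort(groups, axis=-1)[..., m - n:]
    mask = np.zeros_like(groups)
    np.put_along_axis(mask, keep, 1.0, axis=-1)
    return mask.reshape(rows, cols)

def sparsity(n, m):
    """Fraction of weights removed by an n:m pattern."""
    return 1.0 - n / m

W = np.array([[0.1, -0.9, 0.5, 0.2, 0.3, 0.0, -0.4, 0.8]])
mask = nm_mask(W, n=2, m=4)
sparse_W = W * mask  # pruned weights are exactly zero
```

Under this sketch, `sparsity(2, 4)`, `sparsity(1, 4)`, and `sparsity(1, 8)` give 0.5, 0.75, and 0.875, matching the 50%, 75%, and 87.5% sparsity levels quoted for the 2:4, 1:4, and 1:8 patterns. Enforcing the constraint throughout training would amount to reapplying such a mask after weight updates, though how and when the mask is refreshed is a design choice the abstract does not specify.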