In value-based deep reinforcement learning, a pruned network is a good network
This work addresses parameter inefficiency in deep reinforcement learning and offers a way to improve performance with reduced computational resources, though the contribution is incremental because it builds on existing pruning techniques.
The paper tackles the problem of inefficient parameter use in deep reinforcement learning agents by applying gradual magnitude pruning to value-based networks, resulting in dramatic performance improvements using only a small fraction of the full parameters.
Recent work has shown that deep reinforcement learning agents have difficulty in effectively using their network parameters. We leverage prior insights into the advantages of sparse training techniques and demonstrate that gradual magnitude pruning enables value-based agents to maximize parameter effectiveness. This results in networks that yield dramatic performance improvements over traditional networks, using only a small fraction of the full network parameters.
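To make the idea of gradual magnitude pruning concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a cubic sparsity schedule of the kind commonly used for gradual pruning; the text above does not specify the schedule, so that choice, along with the names `target_sparsity`, `magnitude_mask`, `apply_mask` and all step counts and weight values, is hypothetical and chosen only for illustration. A flat list of floats stands in for one layer's weights.

```python
def target_sparsity(step, start_step, end_step, final_sparsity):
    """Assumed cubic schedule: sparsity grows from 0 to final_sparsity
    between start_step and end_step, then stays constant."""
    if step <= start_step:
        return 0.0
    if step >= end_step:
        return final_sparsity
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity * (1.0 - (1.0 - progress) ** 3)


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that drops the smallest-magnitude fraction of weights."""
    num_prune = int(round(sparsity * len(weights)))
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:num_prune]:
        mask[i] = 0
    return mask


def apply_mask(weights, mask):
    """Zero out the pruned weights; kept weights are unchanged."""
    return [w * m for w, m in zip(weights, mask)]


# Hypothetical layer weights and schedule settings, for illustration only.
weights = [0.5, -0.1, 0.03, -0.8, 0.2, 0.01, -0.4, 0.07, 0.9, -0.02]
start_step, end_step, final_sparsity = 10, 110, 0.8

# In a training loop the mask would be recomputed periodically and applied
# after each optimizer update; here we only look at the final step.
final_mask = magnitude_mask(
    weights, target_sparsity(end_step, start_step, end_step, final_sparsity)
)
pruned = apply_mask(weights, final_mask)
```

In this sketch the network starts dense and the mask tightens over training, so only the largest-magnitude weights survive once the schedule ends. Whether pruning is applied per layer or globally, and how often the mask is refreshed, are design choices left open here.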