Honey, I Shrunk The Actor: A Case Study on Preserving Performance with Smaller Actors in Actor-Critic RL
This work addresses resource constraints in RL deployments, such as applications with limited computing power or many actors deployed at once. Its contribution is incremental, since it builds on existing actor-critic methods.
The paper tackles the problem of reducing network size in actor-critic reinforcement learning by treating actor and critic architectures independently. It shows that smaller actors can achieve comparable performance with up to 99% fewer network weights, and an average reduction of 77% across multiple actor-critic algorithms on 9 tasks.
Actors and critics in actor-critic reinforcement learning algorithms are functionally separate, yet they often use the same network architectures. This case study explores the performance impact of network sizes when considering actor and critic architectures independently. By relaxing the assumption of architectural symmetry, it is often possible for smaller actors to achieve comparable policy performance to their symmetric counterparts. Our experiments show up to 99% reduction in the number of network weights with an average reduction of 77% over multiple actor-critic algorithms on 9 independent tasks. Given that reducing actor complexity results in a direct reduction of run-time inference cost, we believe configurations of actors and critics are aspects of actor-critic design that deserve to be considered independently, particularly in resource-constrained applications or when deploying multiple actors simultaneously.
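To make the size argument concrete, the sketch below counts the parameters of a fully connected actor at two hidden widths. It is an illustration only, not the paper's experimental setup: the observation size (17), action size (6), the hidden widths, and the helper name `mlp_param_count` are all hypothetical, and the count includes biases as well as weights.

```python
def mlp_param_count(in_dim, hidden_sizes, out_dim):
    """Count weights and biases of a fully connected network."""
    sizes = [in_dim, *hidden_sizes, out_dim]
    # Each layer has (fan_in * fan_out) weights plus fan_out biases.
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))


# Hypothetical task dimensions, chosen only for illustration.
OBS_DIM, ACT_DIM = 17, 6

# Symmetric baseline: the actor uses the same hidden widths as the critic.
symmetric_actor = mlp_param_count(OBS_DIM, [256, 256], ACT_DIM)

# Asymmetric variant: only the actor shrinks; the critic is left unchanged.
small_actor = mlp_param_count(OBS_DIM, [32, 32], ACT_DIM)

reduction = 1 - small_actor / symmetric_actor
print(f"symmetric actor: {symmetric_actor} parameters")
print(f"small actor:     {small_actor} parameters")
print(f"reduction:       {reduction:.1%}")
```

Because only the actor is evaluated at run time, shrinking it lowers inference cost directly, while the critic, which is used only during training, can keep its full capacity. Whether a given small actor preserves policy performance is an empirical question that this arithmetic does not answer.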