A3C-S: Automated Agent Accelerator Co-Search towards Efficient Deep Reinforcement Learning
This work addresses the efficient deployment of DRL agents for real-time control applications on intelligent devices through hardware-software co-design, searching the agent and its accelerator together.
The paper tackles the challenge of deploying deep reinforcement learning (DRL) agents on resource-limited devices by proposing an automated co-search framework that jointly optimizes agent test scores and hardware efficiency, and reports results that surpass state-of-the-art techniques.
Driven by the explosive interest in applying deep reinforcement learning (DRL) agents to numerous real-time control and decision-making applications, there has been a growing demand to deploy DRL agents to empower daily-life intelligent devices. However, the prohibitive complexity of DRL stands at odds with limited on-device resources. In this work, we propose an Automated Agent Accelerator Co-Search (A3C-S) framework, which to the best of our knowledge is the first to automatically co-search the optimally matched DRL agents and accelerators that maximize both test scores and hardware efficiency. Extensive experiments consistently validate the superiority of A3C-S over state-of-the-art techniques.
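To make the idea of co-searching agents and accelerators concrete, the following is a minimal toy sketch, not the A3C-S algorithm itself. It exhaustively scores every (agent, accelerator) pair with a single scalarized objective: test score minus a weighted latency penalty. All candidate names, scores, workloads, the latency model, and the weight `lam` are hypothetical placeholders chosen only for illustration; the abstract does not specify the actual search spaces, objective, or search procedure.

```python
from itertools import product

# Hypothetical agent candidates: name -> placeholder test score in [0, 1].
AGENT_SCORE = {"small": 0.60, "medium": 0.80, "large": 0.90}
# Hypothetical compute workload of each agent (arbitrary units).
AGENT_WORKLOAD = {"small": 8, "medium": 16, "large": 64}
# Hypothetical accelerator candidates: name -> number of parallel compute units.
ACCEL_UNITS = {"acc_a": 4, "acc_b": 8}


def latency(agent: str, accel: str) -> float:
    """Toy latency model (an assumption): workload divided by parallel units."""
    return AGENT_WORKLOAD[agent] / ACCEL_UNITS[accel]


def objective(agent: str, accel: str, lam: float = 0.05) -> float:
    """Scalarized objective: test score minus lam times latency."""
    return AGENT_SCORE[agent] - lam * latency(agent, accel)


def co_search(lam: float = 0.05) -> tuple[str, str]:
    """Return the (agent, accelerator) pair with the highest objective value."""
    pairs = product(AGENT_SCORE, ACCEL_UNITS)
    return max(pairs, key=lambda pair: objective(pair[0], pair[1], lam))


if __name__ == "__main__":
    print(co_search())         # trade-off between score and latency
    print(co_search(lam=0.0))  # latency ignored: the highest-scoring agent wins
```

With these placeholder numbers, the latency penalty steers the search away from the largest agent toward a mid-sized agent paired with the wider accelerator, while setting `lam=0.0` recovers a score-only search. An exhaustive loop is only feasible for tiny candidate sets; a practical co-search over large agent and accelerator spaces would need a more scalable search strategy.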