AI | Oct 10, 2017

Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations

arXiv:1710.03792v1 | 48 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses optimization problems in cyber-physical systems, but it is incremental as it applies existing DRL methods to new domains and hardware.

The paper presents a general deep reinforcement learning (DRL) framework and validates its effectiveness in three cyber-physical applications: cloud computing resource allocation, residential smart grid task scheduling, and building HVAC system optimal control, with hardware implementations showing significant improvements in area efficiency and power consumption.

The recent breakthroughs of the deep reinforcement learning (DRL) technique in AlphaGo and in playing Atari games have set a good example in handling the large state and action spaces of complicated control problems. The DRL technique comprises (i) an offline deep neural network (DNN) construction phase, which derives the correlation between each state-action pair of the system and its value function, and (ii) an online deep Q-learning phase, which adaptively derives the optimal action and updates value estimates. In this paper, we first present the general DRL framework, which can be widely utilized in many applications with different optimization objectives. This is followed by the introduction of three specific applications: the cloud computing resource allocation problem, the residential smart grid task scheduling problem, and the building HVAC system optimal control problem. The effectiveness of the DRL technique in these three cyber-physical applications has been validated. Finally, this paper investigates stochastic computing-based hardware implementations of the DRL framework, which achieve a significant improvement in area efficiency and power consumption compared with binary-based implementation counterparts.
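The online Q-learning phase described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a small lookup table stands in for the DNN value estimator, and the names `q_update` and `select_action`, the learning rate `alpha`, the discount factor `gamma`, and the two-state toy problem are all assumptions made for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step in place.

    Moves Q(s, a) toward the target r + gamma * max_a' Q(s', a').
    The table `q` is a stand-in for the DNN value estimator (assumption).
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def select_action(q, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


# Hypothetical toy problem: 2 states, 2 actions, all value estimates start at 0.
q = [[0.0, 0.0], [0.0, 0.0]]

# Observe one transition: in state 0, action 1 yields reward 1.0 and leads to state 1.
q_update(q, state=0, action=1, reward=1.0, next_state=1)

# With epsilon = 0 the policy is purely greedy and picks the updated action.
greedy = select_action(q, state=0, epsilon=0.0, rng=random.Random(0))
```

In a DRL setting the table lookup would be replaced by a forward pass through the trained network, and the update by a gradient step on the same temporal-difference target.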

