AR LG Feb 24, 2021

FIXAR: A Fixed-Point Deep Reinforcement Learning Platform with Quantization-Aware Training and Adaptive Parallelism

arXiv:2102.12103v1, 24 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in deep reinforcement learning for researchers and practitioners. Its hardware-software co-design is an incremental advance that optimizes existing methods.

The paper introduces FIXAR, a deep reinforcement learning platform that uses fixed-point data types and quantization-aware training. It reports 2.7 times higher training throughput and 15.4 times higher energy efficiency than a CPU-GPU platform, with no loss of accuracy.

In this paper, we present a deep reinforcement learning platform named FIXAR which employs fixed-point data types and arithmetic units for the first time using a SW/HW co-design approach. Starting from 32-bit fixed-point data, Quantization-Aware Training (QAT) reduces its data precision based on the range of activations and performs retraining to minimize the reward degradation. FIXAR proposes the adaptive array processing core composed of configurable processing elements to support both intra-layer parallelism and intra-batch parallelism for high-throughput inference and training. Finally, FIXAR was implemented on Xilinx U50 and achieves 25293.3 inferences per second (IPS) training throughput and 2638.0 IPS/W accelerator efficiency, which is 2.7 times faster and 15.4 times more energy efficient than those of the CPU-GPU platform without any accuracy degradation.
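The abstract says precision is reduced "based on the range of activations" but gives no further detail here. The following is a minimal sketch of one common way to do range-based fixed-point quantization, not FIXAR's actual algorithm. The function names `choose_frac_bits` and `quantize`, the signed two's-complement word layout, and the round-and-saturate policy are all assumptions made for illustration.

```python
import math


def choose_frac_bits(max_abs: float, total_bits: int) -> int:
    """Pick fractional bits so that max_abs fits in a signed total_bits word.

    Hypothetical helper: one sign bit, as many integer bits as the observed
    activation range needs, and the remainder as fractional bits.
    """
    if max_abs > 0:
        int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    else:
        int_bits = 0
    return total_bits - 1 - int_bits


def quantize(x: float, total_bits: int, frac_bits: int) -> float:
    """Round x to the fixed-point grid and saturate at the word limits."""
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = round(x * scale)          # nearest representable integer code
    q = max(qmin, min(qmax, q))   # saturate instead of wrapping around
    return q / scale              # value the fixed-point code represents


# Example: activations observed in [-3.7, 3.7], reduced to a 16-bit word.
frac_bits = choose_frac_bits(3.7, 16)
approx = quantize(3.7, 16, frac_bits)
```

In a quantization-aware training loop, a step like this would typically be applied to activations during the forward pass, with retraining afterwards to recover any lost reward; how FIXAR schedules that is described in the paper itself, not in this summary.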

