AccDecoder: Accelerated Decoding for Neural-enhanced Video Analytics
This work addresses the challenge of real-time, high-accuracy video analytics for surveillance systems, offering an incremental improvement over existing quality enhancement methods by reducing latency.
The paper tackles the problem of low-quality video streams degrading neural network-based video analytics by introducing AccDecoder, an accelerated decoder that adaptively selects frames for neural super-resolution enhancement, achieving 6-21% accuracy improvement and 20-80% latency reduction compared to baselines.
The quality of the video stream is key to neural network-based video analytics. However, existing surveillance systems inevitably collect low-quality video because of poor-quality cameras or streaming protocols that over-compress or prune the video, e.g., as a result of upstream bandwidth limits. To address this issue, existing studies use quality enhancers (e.g., neural super-resolution) to improve video quality (e.g., resolution) and thereby preserve inference accuracy. Nevertheless, directly applying quality enhancers does not work in practice because it introduces unacceptable latency. In this paper, we present AccDecoder, a novel accelerated decoder for real-time, neural-enhanced video analytics. AccDecoder uses Deep Reinforcement Learning (DRL) to adaptively select a few frames to enhance with neural super-resolution and then up-scales the unselected frames that reference them, which leads to a 6-21% accuracy improvement. AccDecoder also provides efficient inference: it uses DRL to filter the important frames for DNN-based inference and reuses their results for the other frames by extracting the reference relationships among frames and blocks, which reduces latency by 20-80% compared with the baselines.
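The frame-level scheduling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Frame` type, the per-frame `change` score, the two thresholds, and the function names `plan_decoding` and `run_pipeline` are all hypothetical, and a fixed threshold rule stands in for the learned DRL policy. The sketch only shows the control flow in which a few frames receive super-resolution and inference, some receive inference only, and the rest reuse the most recent result.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    index: int
    # Hypothetical score in [0, 1]: how much this frame differs from the
    # frame it references (an assumption, not a quantity from the paper).
    change: float


def plan_decoding(frames: List[Frame],
                  sr_threshold: float = 0.6,
                  infer_threshold: float = 0.3) -> List[str]:
    """Assign one action per frame.

    "sr+infer": enhance with super-resolution, then run DNN inference.
    "infer":    run DNN inference without super-resolution.
    "reuse":    reuse the most recent inference result.

    A threshold rule is used here as a stand-in for the DRL policy.
    """
    plan = []
    for position, frame in enumerate(frames):
        if position == 0 or frame.change >= sr_threshold:
            # The first frame has nothing to reference, so it is always enhanced.
            plan.append("sr+infer")
        elif frame.change >= infer_threshold:
            plan.append("infer")
        else:
            plan.append("reuse")
    return plan


def run_pipeline(frames: List[Frame],
                 plan: List[str],
                 detect: Callable[[Frame], str]) -> List[str]:
    """Call `detect` only on frames not marked "reuse"; propagate results otherwise."""
    results = []
    last_result = None
    for frame, action in zip(frames, plan):
        if action != "reuse":
            last_result = detect(frame)
        results.append(last_result)
    return results


frames = [Frame(i, c) for i, c in enumerate([0.9, 0.1, 0.4, 0.05, 0.7])]
plan = plan_decoding(frames)
results = run_pipeline(frames, plan, detect=lambda f: f"det@{f.index}")
```

In this toy run the detector is invoked on three of the five frames, and the other two inherit the preceding result. In the system described above, the selection would instead be made by the DRL agent, and reuse would follow the reference relationships among frames and blocks rather than simple temporal order.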