CV · Jul 21, 2022

NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition

Amazon
arXiv:2207.10388v10.3827 citations · h-index: 98
AI Analysis · 55

This paper addresses the challenge of accurate video recognition at low computation cost, and it represents an incremental improvement over prior adaptive inference methods.

The paper tackles the problem of efficient video recognition by proposing NSNet, a method that suppresses non-salient frames to reduce computation costs, achieving state-of-the-art accuracy-efficiency trade-offs and 2.4-4.3x faster inference speeds than existing methods.

It is challenging for artificial intelligence systems to achieve accurate video recognition at low computation cost. Efficient video recognition methods based on adaptive inference typically preview videos and focus on salient parts to reduce computation costs. Most existing works focus on learning complex networks with video-classification-based objectives. Because they treat all frames as positive samples, few of them pay attention in their supervision to the discrimination between positive samples (salient frames) and negative samples (non-salient frames). To fill this gap, in this paper, we propose a novel Non-saliency Suppression Network (NSNet), which effectively suppresses the responses of non-salient frames. Specifically, on the frame level, effective pseudo labels that can distinguish between salient and non-salient frames are generated to guide the frame saliency learning. On the video level, a temporal attention module is learned under dual video-level supervision on both the salient and the non-salient representations. Saliency measurements from both levels are combined to exploit multi-granularity complementary information. Extensive experiments conducted on four well-known benchmarks verify that our NSNet not only achieves the state-of-the-art accuracy-efficiency trade-off but also presents a significantly faster (2.4~4.3x) practical inference speed than state-of-the-art methods. Our project page is at https://lawrencexia2008.github.io/projects/nsnet .
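
As a rough illustration of the idea in the abstract, the sketch below blends a frame-level and a video-level saliency score per frame and keeps only the highest-scoring frames for the expensive recognition model. It is not the authors' implementation: the abstract does not say how the two measurements are combined or how frames are selected, so the convex combination, the top-k rule, the function names `combine_saliency` and `select_salient_frames`, and the `weight` parameter are all assumptions made for this example.

```python
from typing import List, Sequence


def combine_saliency(frame_scores: Sequence[float],
                     video_scores: Sequence[float],
                     weight: float = 0.5) -> List[float]:
    """Blend the two saliency measurements per frame.

    Assumption: a simple convex combination. The abstract only says
    that the two levels are combined, not how.
    """
    if len(frame_scores) != len(video_scores):
        raise ValueError("both score lists must cover the same frames")
    return [weight * f + (1.0 - weight) * v
            for f, v in zip(frame_scores, video_scores)]


def select_salient_frames(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in temporal order.

    Assumption: top-k selection stands in for non-saliency suppression.
    """
    k = max(0, min(k, len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical scores for a six-frame clip.
frame_level = [0.1, 0.9, 0.2, 0.8, 0.3, 0.05]
video_level = [0.2, 0.7, 0.1, 0.9, 0.6, 0.1]

combined = combine_saliency(frame_level, video_level)
kept = select_salient_frames(combined, k=3)
print(kept)  # [1, 3, 4]
```

Only the frames in `kept` would be passed to the heavier classifier, which is where the computation saving in this sketch would come from.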
