CV · AI · Aug 5, 2021

Unifying Nonlocal Blocks for Neural Networks

arXiv:2108.02451v3 · 28 citations
AI Analysis

This work targets long-range dependency modeling in neural networks for computer vision, offering an incremental improvement over existing nonlocal blocks.

The paper tackles the limited ability of nonlocal blocks to capture structured long-range dependencies in computer vision. It proposes a unified graph-filter interpretation of existing blocks and a new spectral nonlocal block, and reports clear-cut improvements on tasks such as image classification and action recognition.

Nonlocal-based blocks are designed to capture long-range spatial-temporal dependencies in computer vision tasks. Although they have shown excellent performance, they still lack a mechanism to encode the rich, structured information among elements in an image or video. In this paper, to theoretically analyze the properties of these nonlocal-based blocks, we provide a new perspective to interpret them, where we view them as a set of graph filters generated on a fully-connected graph. Specifically, when choosing the Chebyshev graph filter, a unified formulation can be derived for explaining and analyzing the existing nonlocal-based blocks (e.g., nonlocal block, nonlocal stage, double attention block). Furthermore, by considering spectral properties, we propose an efficient and robust spectral nonlocal block, which is more robust and flexible at capturing long-range dependencies than the existing nonlocal blocks when inserted into deep neural networks. Experimental results demonstrate the clear-cut improvements and practical applicability of our method on image classification, action recognition, semantic segmentation, and person re-identification tasks.
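The graph-filter view described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the function names, the dot-product similarity, and the two weight matrices `w1` and `w2` are assumptions chosen for illustration. Each of the N positions in a feature map is treated as a node of a fully-connected graph, a symmetric normalized affinity matrix plays the role of the graph, and a first-order polynomial of that matrix (the lowest-order case of a Chebyshev-style filter) is added to the input as a residual.

```python
import numpy as np

def normalized_affinity(x):
    """Affinity over all N positions, i.e. a fully-connected graph.

    x: (N, C) array, one C-dimensional feature per position.
    Returns D^{-1/2} M D^{-1/2}, where M holds positive pairwise similarities.
    """
    s = x @ x.T                      # dot-product similarity (an assumed choice)
    m = np.exp(s - s.max())          # positive weights; shift avoids overflow
    d = m.sum(axis=1)                # node degrees
    return m / np.sqrt(np.outer(d, d))

def first_order_graph_filter(x, w1, w2):
    """Residual block: x + x @ w1 + A @ x @ w2.

    The x @ w1 term is the order-0 (identity) part of the filter and
    A @ x @ w2 is the order-1 part that mixes information across positions.
    w1 and w2 are hypothetical (C, C) weights that would be learned.
    """
    a = normalized_affinity(x)
    return x + x @ w1 + a @ x @ w2

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))              # 6 positions, 4 channels
w1 = 0.1 * rng.standard_normal((4, 4))
w2 = 0.1 * rng.standard_normal((4, 4))
y = first_order_graph_filter(x, w1, w2)      # same shape as x
```

Because the affinity is symmetric and degree-normalized, its eigenvalues lie in [-1, 1], which is the kind of spectral property the abstract refers to when it motivates a more robust block. Setting both weight matrices to zero reduces the block to the identity, so it can be inserted into an existing network without changing its initial behavior.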
