CVNov 21, 2017

Non-local Neural Networks

arXiv:1711.07971v310142 citationsHas Code
Originality Highly original
AI Analysis

This addresses the limitation of local processing in convolutional and recurrent networks for computer vision tasks, offering a novel approach to model long-range interactions.

The paper tackled the problem of capturing long-range dependencies in neural networks by introducing non-local operations as a generic building block, which improved video classification on Kinetics and Charades datasets and enhanced object detection/segmentation and pose estimation on COCO tasks.

Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our non-local models can compete or outperform current competition winners on both Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code is available at https://github.com/facebookresearch/video-nonlocal-net .

Code Implementations32 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes