CV, AI, LG, ML | Apr 7, 2021

Differentiable Patch Selection for Image Recognition

arXiv:2104.03059v1 | 121 citations
AI Analysis

This method reduces the memory and compute cost of recognition on high-resolution images, with results reported for traffic sign recognition and fine-grained recognition. It can be attached to any downstream network and trained end-to-end.

The paper tackles the high memory and compute requirements of processing high-resolution images in neural networks. It proposes a differentiable Top-K operator that selects the most relevant patches, so that only those patches are processed, and it needs no bounding box annotations during training.

Neural networks require large amounts of memory and compute to process high-resolution images, even when only a small part of the image is actually informative for the task at hand. We propose a method based on a differentiable Top-K operator to select the most relevant parts of the input to efficiently process high-resolution images. Our method may be interfaced with any downstream neural network, is able to aggregate information from different patches in a flexible way, and allows the whole model to be trained end-to-end using backpropagation. We show results for traffic sign recognition, inter-patch relationship reasoning, and fine-grained recognition without using object/part bounding box annotations during training.
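
To make the idea of a differentiable Top-K selection more concrete, here is a minimal sketch of one common relaxation: averaging hard Top-K indicators over noise-perturbed patch scores. The abstract does not spell out the operator's construction, so this should be read as an illustration under assumptions, not as the paper's implementation. The function names (`hard_top_k`, `soft_top_k`), the Gaussian noise, and the values of `sigma` and `n_samples` are all hypothetical choices made for this example.

```python
import random


def hard_top_k(scores, k):
    """Return a 0/1 indicator list marking the k highest scores."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    chosen = set(top)
    return [1.0 if i in chosen else 0.0 for i in range(len(scores))]


def soft_top_k(scores, k, sigma=0.5, n_samples=500, seed=0):
    """Monte Carlo estimate of a smoothed Top-K indicator.

    Assumption for illustration: Gaussian noise of scale `sigma` is added
    to the scores, and the hard indicators are averaged over the samples.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(scores)
    for _ in range(n_samples):
        noisy = [s + sigma * rng.gauss(0.0, 1.0) for s in scores]
        for i, v in enumerate(hard_top_k(noisy, k)):
            totals[i] += v
    return [t / n_samples for t in totals]


# Hypothetical relevance scores for five patches, e.g. from a small scorer network.
scores = [2.0, 0.1, 1.5, -1.0, 0.3]
weights = soft_top_k(scores, k=2)
```

Each entry of `weights` lies between 0 and 1 and the entries sum to `k`, so they behave like soft selection probabilities over patches. Unlike the hard indicator, the expected value of this average changes smoothly as the scores change, which is what lets a scorer network receive a training signal; in an autodiff framework the corresponding gradient would have to be estimated separately, which this forward-only sketch does not attempt.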

