CV · Aug 21, 2017

STNet: Selective Tuning of Convolutional Networks for Object Localization

arXiv:1708.06418v1 · 16 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of improving object localization in computer vision without full supervision. The contribution appears incremental, as it builds on existing attention models.

The authors tackled the problem of weakly-supervised object localization by proposing STNet, which uses bottom-up and top-down processing to selectively tune convolutional networks, achieving state-of-the-art results on the ImageNet benchmark.

Visual attention modeling has recently gained momentum in developing visual hierarchies provided by Convolutional Neural Networks. Despite recent successes of feedforward processing on the abstraction of concepts from raw images, the inherent nature of feedback processing has remained computationally controversial. Inspired by the computational models of covert visual attention, we propose the Selective Tuning of Convolutional Networks (STNet). It is composed of both streams of Bottom-Up and Top-Down information processing to selectively tune the visual representation of Convolutional networks. We experimentally evaluate the performance of STNet for the weakly-supervised localization task on the ImageNet benchmark dataset. We demonstrate that STNet not only successfully surpasses the state-of-the-art results but also generates attention-driven class hypothesis maps.

