CV · Apr 22, 2021

Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation

arXiv:2104.11176v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the computational cost and accuracy of image understanding in computer vision. It is an incremental advance that integrates differentiable clustering and graph convolution into a convolutional architecture.

The paper tackles inefficient computation in convolutional architectures by proposing a heterogeneous grid convolution that builds a data-adaptive graph representation. It reports outperforming a strong baseline with more than 90% fewer floating-point operations for semantic segmentation, and a state-of-the-art result for road extraction.

This paper proposes a novel heterogeneous grid convolution that builds a graph-based image representation by exploiting heterogeneity in the image content, enabling adaptive, efficient, and controllable computation in a convolutional architecture. More concretely, the approach builds a data-adaptive graph structure from a convolutional layer using a differentiable clustering method, pools features to the graph, performs a novel direction-aware graph convolution, and unpools features back to the convolutional layer. Using the developed module, the paper proposes heterogeneous grid convolutional networks, a highly efficient yet strong extension of existing architectures. We have evaluated the proposed approach on four image understanding tasks: semantic segmentation, object localization, road extraction, and salient object detection. The proposed method is effective on three of the four tasks. In particular, the method outperforms a strong baseline with more than 90% reduction in floating-point operations for semantic segmentation, and achieves the state-of-the-art result for road extraction. We will share our code, model, and data.
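
The four steps named in the abstract (cluster, pool, graph convolution, unpool) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the softmax-over-distances clustering, the co-assignment affinity used as the graph, and the random weights are all assumptions made for illustration. The graph convolution here is a plain mean-aggregation layer, not the direction-aware one the paper describes.

```python
import numpy as np

def soft_assign(features, centers, temperature=1.0):
    """Soft pixel-to-node assignment: softmax over negative squared distances.
    (An assumed stand-in for the paper's differentiable clustering.)"""
    # features: (N, C), centers: (K, C) -> squared distances: (N, K)
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def pool(features, assign):
    """Pool pixel features to K graph nodes as assignment-weighted averages."""
    mass = assign.sum(axis=0)[:, None]                   # (K, 1)
    return assign.T @ features / np.maximum(mass, 1e-8)  # (K, C)

def graph_conv(nodes, affinity, weight):
    """Plain mean-aggregation graph convolution with ReLU.
    NOTE: not direction-aware; a simplified substitute."""
    a = affinity + np.eye(len(affinity))                 # add self-loops
    a = a / a.sum(axis=1, keepdims=True)                 # row-normalize
    return np.maximum(a @ nodes @ weight, 0.0)

def unpool(nodes, assign):
    """Scatter node features back to pixels using the same soft assignment."""
    return assign @ nodes                                # (N, C_out)

def hetero_grid_block(feature_map, num_nodes=4, seed=0):
    """Hypothetical block: cluster -> pool -> graph conv -> unpool."""
    h, w, c = feature_map.shape
    x = feature_map.reshape(h * w, c)
    rng = np.random.default_rng(seed)
    # Assumption: initialize cluster centers from randomly chosen pixels.
    centers = x[rng.choice(h * w, size=num_nodes, replace=False)]
    assign = soft_assign(x, centers)
    nodes = pool(x, assign)
    # Assumption: node affinity from soft co-assignment of pixels.
    affinity = assign.T @ assign
    weight = rng.normal(size=(c, c)) * 0.1               # untrained weights
    nodes = graph_conv(nodes, affinity, weight)
    return unpool(nodes, assign).reshape(h, w, -1)

if __name__ == "__main__":
    fmap = np.random.default_rng(1).normal(size=(8, 8, 3))
    print(hetero_grid_block(fmap, num_nodes=4).shape)
```

In this sketch the graph convolution operates on `num_nodes` nodes rather than on all `h * w` pixels, which is where the savings in floating-point operations would come from. The number of nodes is also the knob that makes the computation controllable.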

