CV · Oct 3, 2022

Feature Embedding by Template Matching as a ResNet Block

arXiv:2210.00992v2 · 1 citation · h-index: 27
AI Analysis

This work addresses the problem of enhancing local feature embedding in neural networks for image classification, offering an incremental improvement over existing architectures.

The paper reformulates convolution blocks as feature selection via template matching and introduces a residual block that uses label information to enforce semantically meaningful local feature embeddings, resulting in substantial performance improvements on image classification benchmarks.

Convolution blocks serve as local feature extractors and are key to the success of neural networks. To make local semantic feature embedding explicit, we reformulate convolution blocks as feature selection according to the best-matching kernel. In this manner, we show that typical ResNet blocks indeed perform local feature embedding via template matching once batch normalization (BN) followed by a rectified linear unit (ReLU) is interpreted as an arg-max optimizer. Following this perspective, we tailor a residual block that explicitly forces semantically meaningful local feature embedding by using label information. Specifically, we assign a feature vector to each local region according to the classes that the corresponding region matches. We evaluate our method on three popular benchmark datasets with several architectures for image classification and consistently show that our approach substantially improves the performance of the baseline architectures.
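
The template-matching view described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's residual block: it uses a hard arg-max over raw correlation scores, whereas the paper interprets BN followed by ReLU as the arg-max step and additionally uses label information. The function names, the toy templates, and the embedding table are all hypothetical and chosen only for illustration.

```python
import numpy as np

def extract_patches(image, k):
    """Return all k x k patches of a 2-D image as rows (stride 1, no padding)."""
    h, w = image.shape
    rows = [image[i:i + k, j:j + k].ravel()
            for i in range(h - k + 1)
            for j in range(w - k + 1)]
    return np.stack(rows)  # shape: (num_patches, k*k)

def template_matching_embedding(image, templates, embeddings):
    """Assign each local patch the embedding of its best-matching template.

    templates:  (num_templates, k, k) array, playing the role of conv kernels.
    embeddings: (num_templates, dim) array, one feature vector per template
                (an assumption of this sketch, not the paper's parameterization).
    """
    k = templates.shape[-1]
    patches = extract_patches(image, k)                   # (P, k*k)
    flat_templates = templates.reshape(len(templates), -1)
    scores = patches @ flat_templates.T                   # (P, T) correlation scores
    best = scores.argmax(axis=1)                          # hard arg-max selection
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    features = embeddings[best].reshape(out_h, out_w, -1)
    return features, best.reshape(out_h, out_w)

# Toy usage: a 3x3 image with a diagonal, matched against two 2x2 templates.
image = np.array([[1., 0., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])
templates = np.array([[[1., 0.], [0., 1.]],   # diagonal template
                      [[0., 1.], [1., 0.]]])  # anti-diagonal template
embeddings = np.array([[1., 0.],              # vector assigned when template 0 wins
                       [0., 1.]])             # vector assigned when template 1 wins
features, winners = template_matching_embedding(image, templates, embeddings)
# winners is [[0, 1], [1, 0]]; features has shape (2, 2, 2)
```

Each output location carries the feature vector of whichever template scored highest on that patch, which is the "feature selection according to the best-matching kernel" reading of a convolution block. A trainable version would need a differentiable relaxation of the arg-max, which this sketch does not attempt.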
