CV, LG, ML · Aug 18, 2023

Generalized Sum Pooling for Metric Learning

arXiv:2308.09228v2 · 10 citations · h-index: 23 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a specific architectural bottleneck in metric learning for computer vision applications, offering an incremental improvement over existing pooling methods.

The paper tackles the limitation of global average pooling (GAP) in deep metric learning by proposing a learnable generalized sum pooling (GSP) method that selects subsets of semantic entities and weights their importance, achieving improved performance on four popular benchmarks.

A common architectural choice for deep metric learning is a convolutional neural network followed by global average pooling (GAP). Albeit simple, GAP is a highly effective way to aggregate information. One possible explanation for the effectiveness of GAP is considering each feature vector as representing a different semantic entity and GAP as a convex combination of them. Following this perspective, we generalize GAP and propose a learnable generalized sum pooling method (GSP). GSP improves GAP with two distinct abilities: i) the ability to choose a subset of semantic entities, effectively learning to ignore nuisance information, and ii) learning the weights corresponding to the importance of each entity. Formally, we propose an entropy-smoothed optimal transport problem and show that it is a strict generalization of GAP, i.e., a specific realization of the problem gives back GAP. We show that this optimization problem enjoys analytical gradients enabling us to use it as a direct learnable replacement for GAP. We further propose a zero-shot loss to ease the learning of GSP. We show the effectiveness of our method with extensive evaluations on 4 popular metric learning benchmarks. Code is available at: GSP-DML Framework
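The convex-combination view in the abstract can be illustrated with a small NumPy sketch. This is not the paper's entropy-smoothed optimal transport formulation, and it is not taken from the authors' code. It only shows that GAP is a sum pooling with uniform weights, and that non-uniform or zero weights let the pooling emphasize or ignore individual feature vectors. The function names, the softmax weighting, and the `temperature` parameter are assumptions made for illustration.

```python
import numpy as np


def global_average_pool(features):
    """GAP over n local feature vectors; features has shape (n, d)."""
    return np.asarray(features, dtype=float).mean(axis=0)


def weighted_sum_pool(features, weights):
    """Convex combination of the rows of `features`.

    `weights` must be non-negative; they are normalized to sum to 1.
    A zero weight drops the corresponding feature vector entirely.
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (features.shape[0],):
        raise ValueError("need exactly one weight per feature vector")
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    return (weights / weights.sum()) @ features


def softmax_weights(scores, temperature=1.0):
    """Illustrative stand-in for learned importance weights (hypothetical).

    A large temperature pushes the weights toward uniform (GAP-like);
    a small temperature concentrates them on the highest-scoring vector.
    """
    z = np.asarray(scores, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()
```

With uniform weights, `weighted_sum_pool` returns the same vector as `global_average_pool`, which mirrors the claim that GAP is one specific realization of the more general pooling. In the paper the weights come from solving the optimal transport problem rather than from a plain softmax.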

Code Implementations: 1 repo
