CV · Jul 29, 2021

Bridging Gap between Image Pixels and Semantics via Supervision: A Survey

arXiv:2107.13757v3 · 10 citations
Originality: Synthesis-oriented
AI Analysis

The survey addresses the long-standing problem of connecting low-level image features to high-level semantics, which is relevant to researchers and practitioners in computer vision, but its contribution is incremental because it synthesizes existing work.

This survey reviews the semantic gap problem in images and claims that supervised learning is the primary method for bridging it, illustrated through examples in object detection and metric learning for content-based image retrieval.

The gap between the low-level features and the semantic meanings of images, called the semantic gap, has been known for decades, and resolving it is a long-standing problem. This work reviews the semantic gap problem and surveys recent efforts to bridge it. Most importantly, we claim that the semantic gap is primarily bridged through supervised learning today. We draw on experience from two application domains to illustrate this point: 1) object detection and 2) metric learning for content-based image retrieval (CBIR). To begin with, this paper offers a historical retrospective on supervision, makes a gradual transition to the modern data-driven methodology, and introduces commonly used datasets. Then, it summarizes various supervision methods for bridging the semantic gap in the context of object detection and metric learning.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
