CV · LG · Dec 19, 2017

End-to-end weakly-supervised semantic alignment

arXiv:1712.06861v2 · 189 citations
Originality: Incremental advance
AI Analysis

This paper addresses dense semantic correspondence, a problem of interest to computer vision researchers, and offers an incremental advance by reducing the need for manually annotated correspondences.

The paper tackles semantic alignment between images of the same object category by proposing an end-to-end trainable CNN architecture using weak image-level supervision, achieving state-of-the-art performance on multiple benchmarks.

We tackle the task of semantic alignment where the goal is to compute dense semantic correspondence aligning two images depicting objects of the same category. This is a challenging task due to large intra-class variation, changes in viewpoint and background clutter. We present the following three principal contributions. First, we develop a convolutional neural network architecture for semantic alignment that is trainable in an end-to-end manner from weak image-level supervision in the form of matching image pairs. The outcome is that parameters are learnt from rich appearance variation present in different but semantically related images without the need for tedious manual annotation of correspondences at training time. Second, the main component of this architecture is a differentiable soft inlier scoring module, inspired by the RANSAC inlier scoring procedure, that computes the quality of the alignment based on only geometrically consistent correspondences thereby reducing the effect of background clutter. Third, we demonstrate that the proposed approach achieves state-of-the-art performance on multiple standard benchmarks for semantic alignment.
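The soft inlier scoring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes source and target features lie on the same regular grid, that the alignment is a 2x3 affine matrix, and that "soft" inlier membership is modelled with a Gaussian falloff of width `sigma`. The function names, the Gaussian mask and the `sigma` parameter are hypothetical choices made for illustration only.

```python
import numpy as np


def grid_coords(h, w):
    """Return an (h*w, 2) array of normalized (x, y) cell centres in [-1, 1]."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    return np.stack([xs.ravel(), ys.ravel()], axis=1)


def soft_inlier_score(corr, theta, h, w, sigma=0.1):
    """Sum the match scores that agree with the geometric transform `theta`.

    corr  : (h*w, h*w) array, corr[i, j] = similarity of source cell i and target cell j
    theta : (2, 3) affine matrix mapping target coordinates to source coordinates
    sigma : width of the (assumed) Gaussian soft inlier mask
    """
    coords = grid_coords(h, w)                          # source and target share one grid
    warped = coords @ theta[:, :2].T + theta[:, 2]      # target cells mapped into the source frame
    # Squared distance between every source cell i and every warped target cell j.
    d2 = ((coords[:, None, :] - warped[None, :, :]) ** 2).sum(axis=-1)
    mask = np.exp(-d2 / (2.0 * sigma ** 2))             # close to 1 for consistent pairs, close to 0 otherwise
    # Matches far from the transform (e.g. background clutter) contribute almost nothing.
    return float((mask * corr).sum())


if __name__ == "__main__":
    h = w = 4
    corr = np.eye(h * w)                                # toy case: each cell matches itself
    identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    shifted = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]])
    print(soft_inlier_score(corr, identity, h, w))
    print(soft_inlier_score(corr, shifted, h, w))
```

Because the mask is a smooth function of `theta`, a score of this kind could in principle be maximized by gradient-based training in an autodiff framework, which is the property the abstract attributes to its differentiable scoring module.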

Code Implementations: 2 repos