CV · Apr 17, 2023

Learning Geometry-aware Representations by Sketching

arXiv:2304.08204v1 · 12 citations · h-index: 45
Originality: Highly original
AI Analysis

This paper addresses the need for geometry-aware representations in computer vision with a novel approach that improves performance on downstream tasks, though its application to existing datasets appears incremental.

The paper tackles the problem of learning geometric representations for vision tasks. It proposes a method that converts images into colored strokes to explicitly incorporate geometric information, and it reports improved performance in object attribute classification and domain transfer on datasets such as CLEVR and STL-10.

Understanding geometric concepts, such as distance and shape, is essential for understanding the real world and also for many vision tasks. To incorporate such information into a visual representation of a scene, we propose learning to represent the scene by sketching, inspired by human behavior. Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene in a single inference step without requiring a sketch dataset. A sketch is then generated from the strokes where CLIP-based perceptual loss maintains a semantic similarity between the sketch and the image. We show theoretically that sketching is equivariant with respect to arbitrary affine transformations and thus provably preserves geometric information. Experimental results show that LBS substantially improves the performance of object attribute classification on the unlabeled CLEVR dataset, domain transfer between CLEVR and STL-10 datasets, and for diverse downstream tasks, confirming that LBS provides rich geometric information.
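The equivariance claim in the abstract can be illustrated with a small numerical check. The sketch below is not the paper's proof or implementation. It assumes, for illustration only, that a stroke is a Bézier curve given by 2D control points, and it verifies the standard geometric fact that applying an affine map to the control points gives the same curve as applying the map to the drawn curve. The names `bezier_point`, `affine`, and `max_equivariance_gap` are hypothetical helpers, not functions from LBS.

```python
from math import comb, cos, sin, pi


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve with 2D control points `ctrl` at t in [0, 1]."""
    n = len(ctrl) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(ctrl):
        # Bernstein basis weight; the weights sum to 1 for every t.
        w = comb(n, i) * (1 - t) ** (n - i) * t ** i
        x += w * px
        y += w * py
    return (x, y)


def affine(p, A, b):
    """Apply the affine map p -> A p + b to a 2D point."""
    (a11, a12), (a21, a22) = A
    return (a11 * p[0] + a12 * p[1] + b[0],
            a21 * p[0] + a22 * p[1] + b[1])


def max_equivariance_gap(ctrl, A, b, samples=50):
    """Largest coordinate-wise gap between 'transform the stroke, then draw'
    and 'draw the stroke, then transform', over sampled curve points."""
    moved_ctrl = [affine(p, A, b) for p in ctrl]
    gap = 0.0
    for k in range(samples + 1):
        t = k / samples
        p1 = bezier_point(moved_ctrl, t)          # transform, then draw
        p2 = affine(bezier_point(ctrl, t), A, b)  # draw, then transform
        gap = max(gap, abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
    return gap


# Assumed example stroke (cubic Bezier) and an arbitrary affine map
# combining rotation, anisotropic scaling, shear, and translation.
stroke = [(0.1, 0.2), (0.4, 0.9), (0.7, 0.1), (0.9, 0.6)]
theta = pi / 6
A = ((1.5 * cos(theta), -0.5 * sin(theta)),
     (1.5 * sin(theta), 0.8 * cos(theta)))
b = (0.3, -0.2)

gap = max_equivariance_gap(stroke, A, b)
print(f"max gap: {gap:.2e}")
```

The gap stays at floating-point rounding level because the Bernstein weights sum to one, so the translation term passes through the weighted sum unchanged. This covers only the stroke geometry. Whether the learned image-to-stroke encoder is itself equivariant is the subject of the paper's theoretical argument and is not tested here.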
