CV · Jul 7, 2020

On Learning Semantic Representations for Million-Scale Free-Hand Sketches

arXiv:2007.04101v1 · 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of sketch understanding for computer vision applications, offering incremental improvements in retrieval and recognition tasks.

The paper tackles the challenge of learning semantic representations for million-scale free-hand sketches. It proposes a dual-branch CNN-RNN architecture that encodes both the static and temporal patterns of sketch strokes, then applies it to hashing retrieval and zero-shot recognition, where the resulting models are reported to outperform state-of-the-art competitors on million-scale sketches.

In this paper, we study learning semantic representations for million-scale free-hand sketches. This is highly challenging due to the domain-unique traits of sketches, e.g., diverse, sparse, abstract, noisy. We propose a dual-branch CNN-RNN network architecture to represent sketches, which simultaneously encodes both the static and temporal patterns of sketch strokes. Based on this architecture, we further explore learning the sketch-oriented semantic representations in two challenging yet practical settings, i.e., hashing retrieval and zero-shot recognition on million-scale sketches. Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to accommodate both the abstract and messy traits of sketches. (ii) We propose a deep embedding model for sketch zero-shot recognition, via collecting a large-scale edge-map dataset and proposing to extract a set of semantic vectors from edge-maps as the semantic knowledge for sketch zero-shot domain alignment. Both deep models are evaluated by comprehensive experiments on million-scale sketches and outperform the state-of-the-art competitors.
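To make the hashing retrieval setting more concrete, the following is a minimal sketch of how binary codes are typically used at query time: real-valued features are thresholded into bits and gallery items are ranked by Hamming distance. It is an illustration under stated assumptions, not the paper's method. The zero threshold, the toy 4-dimensional features, and the helper names `binarize`, `hamming`, and `retrieve` are all hypothetical, and the paper's learned hashing loss and CNN-RNN encoder are not reproduced here.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Threshold a real-valued feature vector at zero to get a binary code.

    Assumption: a plain sign threshold; a trained hashing model would learn
    features that lose little information under this step.
    """
    return [1 if x > 0 else 0 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code: Sequence[int],
             gallery_codes: Sequence[Sequence[int]],
             top_k: int = 3) -> List[int]:
    """Return indices of the top_k gallery codes closest to the query."""
    ranked = sorted(range(len(gallery_codes)),
                    key=lambda i: hamming(query_code, gallery_codes[i]))
    return ranked[:top_k]


# Toy features standing in for encoder outputs (hypothetical values).
gallery_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, 0.2, 0.1],
]
gallery_codes = [binarize(f) for f in gallery_features]
query_code = binarize([0.8, -0.1, 0.5, -0.3])

top = retrieve(query_code, gallery_codes, top_k=2)
print(top)
```

Compact binary codes are attractive at million scale because Hamming distance reduces to bit comparisons and the codes need far less storage than floating-point embeddings.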

Code Implementations: 1 repo