CV, Jul 4, 2018

Deep Cross-modality Adaptation via Semantics Preserving Adversarial Learning for Sketch-based 3D Shape Retrieval

arXiv:1807.01806v18.756 citations
Originality: Incremental advance
AI Analysis

This paper addresses cross-modality retrieval for computer vision applications, but its contribution is incremental because it builds on existing adversarial and metric learning techniques.

The paper tackles the problem of retrieving 3D shapes from 2D sketches. It proposes a deep cross-modality adaptation model that uses adversarial learning to align the two feature spaces, and it reports better retrieval performance than state-of-the-art methods on the SHREC 2013 and SHREC 2014 datasets.

Due to the large cross-modality discrepancy between 2D sketches and 3D shapes, retrieving 3D shapes by sketches is a significantly challenging task. To address this problem, we propose a novel framework to learn a discriminative deep cross-modality adaptation model in this paper. Specifically, we first separately adopt two metric networks, following two deep convolutional neural networks (CNNs), to learn modality-specific discriminative features based on an importance-aware metric learning method. Subsequently, we explicitly introduce a cross-modality transformation network to compensate for the divergence between two modalities, which can transfer features of 2D sketches to the feature space of 3D shapes. We develop an adversarial learning based method to train the transformation model, by simultaneously enhancing the holistic correlations between data distributions of two modalities, and mitigating the local semantic divergences through minimizing a cross-modality mean discrepancy term. Experimental results on the SHREC 2013 and SHREC 2014 datasets clearly show the superior retrieval performance of our proposed model, compared to the state-of-the-art approaches.
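The abstract does not give the exact form of the cross-modality mean discrepancy term, so the following is only a minimal sketch of one plausible reading: the average squared distance between per-class feature means of transformed sketch features and 3D shape features. The function names `class_means` and `cross_modality_mean_discrepancy`, the plain-list feature representation, and the choice of squared Euclidean distance are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def class_means(features, labels):
    """Return {label: mean feature vector} for a list of feature vectors."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def cross_modality_mean_discrepancy(sketch_feats, sketch_labels,
                                    shape_feats, shape_labels):
    """Hypothetical discrepancy term (an assumption, not the paper's formula).

    For each class present in both modalities, take the squared Euclidean
    distance between the class mean of the (already transformed) sketch
    features and the class mean of the shape features, then average the
    result over the shared classes.
    """
    sketch_means = class_means(sketch_feats, sketch_labels)
    shape_means = class_means(shape_feats, shape_labels)
    shared = sorted(set(sketch_means) & set(shape_means))
    if not shared:
        return 0.0
    total = 0.0
    for c in shared:
        total += sum((a - b) ** 2
                     for a, b in zip(sketch_means[c], shape_means[c]))
    return total / len(shared)


# Toy usage: class 0 means coincide, class 1 means differ by (3, 4).
sketch_feats = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0]]
sketch_labels = [0, 0, 1]
shape_feats = [[1.0, 0.0], [3.0, 4.0]]
shape_labels = [0, 1]
loss = cross_modality_mean_discrepancy(
    sketch_feats, sketch_labels, shape_feats, shape_labels)
```

In a training loop, a term of this kind would presumably be added to the adversarial loss so that the transformation network is penalized when sketches of a class drift away from shapes of the same class, even if the overall distributions match.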
