CV · Mar 10, 2020

Cross-modal Multi-task Learning for Graphic Recognition of Caricature Face

arXiv:2003.05787v1
AI Analysis

This work addresses a cross-modal recognition challenge for caricature analysis, offering an incremental improvement over existing methods in a domain-specific application.

The paper tackles caricature-visual face recognition, which is hampered by the extreme non-rigid distortions of caricatures. It proposes a dynamic multi-task learning method that learns task weights according to task importance, and it reports state-of-the-art performance on the CaVI and WebCaricature datasets.

Face recognition on realistic visual images has been well studied and has made significant progress in the recent decade. Face recognition on caricatures, by contrast, lags far behind the performance achieved on visual images. This is largely due to the extreme non-rigid distortions that caricatures introduce by exaggerating facial features to strengthen the characters. Because caricatures and visual images are heterogeneous modalities, caricature-visual face recognition is a cross-modal problem. In this paper, we propose a method to conduct caricature-visual face recognition via multi-task learning. Rather than using conventional multi-task learning with fixed task weights, this work proposes an approach that learns the task weights according to the importance of the tasks. The proposed multi-task learning with dynamic task weights makes it possible to train the hard task and the easy task appropriately, instead of getting stuck over-training the easy task as conventional methods do. The experimental results demonstrate the effectiveness of the proposed dynamic multi-task learning for cross-modal caricature-visual face recognition. The performance on the CaVI and WebCaricature datasets shows its superiority over state-of-the-art methods.
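
The contrast between fixed and dynamic task weights can be illustrated with a small sketch. The abstract does not spell out how the weights are learned, so the rule below (a softmax over the current per-task losses, so that the harder task receives the larger weight) is an assumption made purely for illustration and is not the paper's method. The function names, the `temperature` parameter, and the loss values are all hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_task_weights(task_losses, temperature=1.0):
    """Hypothetical rule (an assumption, not taken from the paper):
    weight each task by the softmax of its current loss, so a task
    with a larger loss (a harder task) gets a larger weight."""
    return softmax(task_losses, temperature)


def combined_loss(task_losses, weights):
    """Weighted sum of per-task losses, as in standard multi-task learning."""
    return sum(w * l for w, l in zip(weights, task_losses))


# Illustrative loss values only: a hard task and an easy task.
losses = [2.0, 0.5]

fixed_weights = [0.5, 0.5]                     # conventional fixed weighting
dynamic_weights = dynamic_task_weights(losses)  # shifts weight to the hard task

fixed_total = combined_loss(losses, fixed_weights)
dynamic_total = combined_loss(losses, dynamic_weights)
```

With fixed weights, both tasks contribute equally to the objective regardless of how well each is already fitted. Under the assumed rule, the weight moves toward whichever task currently has the higher loss, which is one simple way to avoid over-training the easy task.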

