SwiDeN: Convolutional Neural Networks for Depiction-Invariant Object Recognition
The work addresses a limitation of current object recognition systems, which are typically specialized for a single depictive style, and so enables more versatile applications. The contribution appears incremental, however, since it builds on existing CNN frameworks.
The paper tackles the problem of object recognition across different visual depiction styles (e.g., photos, sketches) by proposing SwiDeN, a CNN architecture with a depictive style-based switching mechanism, and shows it outperforms other approaches on a 50-category Photo-Art dataset.
Current state-of-the-art object recognition architectures achieve impressive performance but are typically specialized for a single depictive style (e.g., photos only, sketches only). In this paper, we present SwiDeN, our Convolutional Neural Network (CNN) architecture, which recognizes objects regardless of how they are visually depicted (line drawing, realistic shaded drawing, photograph, etc.). In SwiDeN, we utilize a novel 'deep' depictive style-based switching mechanism which appropriately addresses the depiction-specific and depiction-invariant aspects of the problem. We compare SwiDeN with alternative architectures and prior work on a 50-category Photo-Art dataset containing objects depicted in multiple styles. Experimental results show that SwiDeN outperforms other approaches for the depiction-invariant object recognition problem.
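To make the idea of style-based switching concrete, the following is a minimal toy sketch, not the SwiDeN architecture itself. The abstract does not specify where the switch sits or how the style is obtained, so the layout here (a shared layer, then one branch per depictive style, then a shared classifier) and the assumption that the style label is supplied with the input are purely illustrative. The class name `SwitchedNet`, the helper functions, the layer sizes, and the random untrained weights are all hypothetical.

```python
import random


def linear(weights, x):
    """Apply a dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def make_weights(rng, n_out, n_in):
    """Random (untrained) weight matrix, for illustration only."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


class SwitchedNet:
    """Toy network: shared layer -> style-selected branch -> shared classifier.

    Hypothetical layout; the real SwiDeN layer arrangement may differ.
    """

    def __init__(self, n_in, n_hidden, n_classes, styles, seed=0):
        rng = random.Random(seed)
        # Depiction-invariant part: used for every input.
        self.shared = make_weights(rng, n_hidden, n_in)
        # Depiction-specific part: one branch per depictive style.
        self.branches = {s: make_weights(rng, n_hidden, n_hidden) for s in styles}
        # Shared classifier over object categories.
        self.classifier = make_weights(rng, n_classes, n_hidden)

    def forward(self, x, style):
        h = relu(linear(self.shared, x))
        # The switch: route through the branch matching the given style.
        h = relu(linear(self.branches[style], h))
        return linear(self.classifier, h)

    def predict(self, x, style):
        scores = self.forward(x, style)
        # Index of the highest-scoring category.
        return max(range(len(scores)), key=scores.__getitem__)


# Example usage with two assumed styles and 50 object categories.
net = SwitchedNet(n_in=4, n_hidden=8, n_classes=50, styles=["photo", "sketch"])
features = [0.2, 0.5, 0.1, 0.9]  # stand-in for image features
category = net.predict(features, "sketch")
```

In this sketch only the middle layer depends on the style, so the parameters before and after it are shared across all depictions. An unknown style raises a `KeyError` rather than silently falling back to some default branch.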