BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography
This dataset addresses the limited ability of machine recognition systems to handle non-photographic artwork. It is aimed primarily at researchers in computer vision and AI. The contribution is incremental in that it builds on existing data collection efforts.
The authors address the challenge of recognizing and categorizing artistic images that differ from everyday photography by creating the Behance Artistic Media dataset, a large-scale collection of annotated artwork. Their baseline experiments show its value for style prediction, object classifier improvement, and domain adaptation.
Computer vision systems are designed to work well within the context of everyday photography. However, artists often render the world around them in ways that do not resemble photographs. Artwork produced by people is not constrained to mimic the physical world, making it more challenging for machines to recognize. This work is a step toward teaching machines how to categorize images in ways that are valuable to humans. First, we collect a large-scale dataset of contemporary artwork from Behance, a website containing millions of portfolios from professional and commercial artists. We annotate Behance imagery with rich attribute labels for content, emotions, and artistic media. Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation. We believe our Behance Artistic Media dataset will be a good starting point for researchers wishing to study artistic imagery and relevant problems.