Automatic Attribute Discovery with Neural Activations
This work addresses the challenge of attribute recognition in noisy, real-world data for computer vision applications, though it appears incremental because it builds on existing neural network methods.
The paper tackles the problem of automatically discovering visual attributes from noisy web image-text data without supervised datasets, showing that neural activations can be used to learn classifiers that align with human perception and provide insights into perceptual depth.
How can a machine learn to recognize visual attributes emerging from online communities without a definitive supervised dataset? This paper proposes an automatic approach to discover and analyze visual attributes from a noisy collection of image-text data on the Web. Our approach is based on the relationship between attributes and neural activations in a deep network. We characterize the visual property of an attribute word as a divergence within a weakly annotated set of images. We show that neural activations are useful for discovering attributes and for learning a classifier that agrees well with human perception from noisy real-world Web data. The empirical study suggests that the layered structure of deep neural networks also gives us insights into the perceptual depth of a given word. Finally, we demonstrate that we can utilize highly activating neurons for finding semantically relevant regions.
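As an illustration of the divergence idea, the following minimal sketch scores each neuron by how differently it responds to images tagged with a word versus images without that tag. The abstract does not specify the divergence measure, the binning, or the network, so the symmetric KL divergence over activation histograms, the bin count, and all function names here (`histogram`, `symmetric_kl`, `neuron_divergences`) are assumptions made for illustration, not the paper's implementation. Activations are represented as plain lists of floats, one vector per image.

```python
import math
import random


def histogram(values, bins, lo, hi, eps=1e-6):
    """Return a smoothed, normalized histogram of values over [lo, hi]."""
    counts = [eps] * bins  # eps avoids log(0) for empty bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp edge values into the last/first bin
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def symmetric_kl(p, q):
    """Symmetric KL divergence KL(p||q) + KL(q||p) for discrete distributions."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def neuron_divergences(pos_acts, neg_acts, bins=10):
    """Per-neuron divergence between tagged (pos) and untagged (neg) images.

    pos_acts, neg_acts: lists of activation vectors, one vector per image.
    """
    scores = []
    for j in range(len(pos_acts[0])):
        pos = [a[j] for a in pos_acts]
        neg = [a[j] for a in neg_acts]
        lo, hi = min(pos + neg), max(pos + neg)
        if hi == lo:  # constant neuron carries no signal
            scores.append(0.0)
            continue
        scores.append(symmetric_kl(histogram(pos, bins, lo, hi),
                                   histogram(neg, bins, lo, hi)))
    return scores


# Synthetic example (hypothetical data): neuron 2 responds more strongly
# to images tagged with the word; neurons 0 and 1 are unrelated to it.
rng = random.Random(0)
tagged = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(3, 1)] for _ in range(300)]
untagged = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(300)]
scores = neuron_divergences(tagged, untagged)
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Under these assumptions, a word whose tagged images shift many neurons would count as more visual than one whose tagged images look like the background set, and the top entries of `ranked` are the candidate neurons to feed into a classifier or to inspect for relevant image regions.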