Deep Architectures for Face Attributes
This work addresses face attribute prediction for computer vision applications. It is incremental: it builds on existing deep learning methods and contributes a new dataset and new fine-tuning strategies.
The authors predict face attributes such as age, gender, ethnicity, and emotion by first training a deep convolutional neural network for identity classification and then fine-tuning it for attribute classification. Producing all predictions requires only 1 GFLOP.
We train a deep convolutional neural network to perform identity classification using a new dataset of public figures annotated with age, gender, ethnicity, and emotion labels, and then fine-tune it for attribute classification. An optimal pattern for sharing computational resources within this network is determined by experiment, requiring only 1 GFLOP to produce all predictions. Rather than fine-tuning by relearning the weights of a single additional layer placed after the penultimate layer of the identity network, we try several different fine-tuning depths for each attribute. We find that prediction of age and emotion is improved by fine-tuning from earlier layers onward, presumably because deeper layers are progressively more invariant to non-identity-related changes in the input.
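The trade-off between fine-tuning depth and shared computation can be illustrated with a small cost model. The following is a minimal sketch, not the authors' procedure: the per-layer costs in `LAYER_MFLOPS`, the function `total_mflops`, and the branch depths in the example are all hypothetical values chosen for illustration. It assumes each attribute reuses the first `d` layers of the frozen identity network and runs its own fine-tuned copy of the remaining layers.

```python
# Hypothetical per-layer forward-pass costs in MFLOPs (illustrative only,
# not taken from the paper).
LAYER_MFLOPS = [120, 200, 150, 80, 20]


def total_mflops(layer_costs, branch_depth):
    """Cost of one forward pass that produces every attribute prediction.

    branch_depth[attr] is the number of leading layers that the attribute
    shares with the frozen identity network. Layers from that depth onward
    are a private, fine-tuned copy evaluated separately for the attribute.
    """
    n = len(layer_costs)
    for attr, depth in branch_depth.items():
        if not 0 <= depth <= n:
            raise ValueError(f"branch depth for {attr!r} must be in [0, {n}]")
    # The shared trunk is evaluated once, up to the deepest branch point.
    shared = sum(layer_costs[:max(branch_depth.values(), default=0)])
    # Each attribute then pays for its own copy of the remaining layers.
    private = sum(sum(layer_costs[d:]) for d in branch_depth.values())
    return shared + private


# Assumed depths: age and emotion branch early (fine-tuned from earlier
# layers onward); the depths for gender and ethnicity are hypothetical.
shared_plan = {"gender": 4, "ethnicity": 4, "age": 2, "emotion": 2}
# Baseline with no sharing: every attribute runs a full private network.
separate_plan = {attr: 0 for attr in shared_plan}

print(total_mflops(LAYER_MFLOPS, shared_plan))
print(total_mflops(LAYER_MFLOPS, separate_plan))
```

Under these assumed costs, branching an attribute earlier improves its capacity to adapt but adds a private copy of every layer after the branch point, so the choice of depth per attribute directly sets the total compute budget.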