Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification
This work addresses the tedious and error-prone process of manually designing multi-task learning architectures, a concern for computer vision researchers and practitioners, particularly in person attribute classification. The contribution is incremental in that it builds on existing multi-task learning concepts.
The paper tackles the problem of manually designing multi-task deep learning architectures by proposing a principled approach that starts with a thin network and dynamically widens it during training to group similar tasks together. In extensive evaluations on person attribute classification tasks, the resulting models are fast and compact, and they closely match or exceed state-of-the-art accuracy.
Multi-task learning aims to improve generalization performance of multiple prediction tasks by appropriately sharing relevant information across them. In the context of deep neural networks, this idea is often realized by hand-designed network architectures with layers that are shared across tasks and branches that encode task-specific features. However, the space of possible multi-task deep architectures is combinatorially large, and the final architecture is often arrived at by manual exploration of this space subject to the designer's bias, which can be both error-prone and tedious. In this work, we propose a principled approach for designing compact multi-task deep learning architectures. Our approach starts with a thin network and dynamically widens it in a greedy manner during training using a novel criterion that promotes grouping of similar tasks together. Our extensive evaluation on person attribute classification tasks involving facial and clothing attributes suggests that the models produced by the proposed method are fast, compact, and can closely match or exceed the state-of-the-art accuracy of strong baselines that rely on much more expensive models.
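To make the idea of grouping similar tasks into shared branches concrete, the following is a minimal sketch and not the criterion proposed in the paper, which the abstract does not specify. It assumes a hypothetical precomputed pairwise task affinity matrix with values in [0, 1], and it uses simple bottom-up greedy merging as a stand-in, whereas the paper's method widens a thin network during training. The function name `group_tasks`, the `threshold` parameter, and the toy affinity values are all illustrative assumptions.

```python
from itertools import combinations


def group_tasks(affinity, threshold):
    """Greedily merge task groups with the highest average pairwise affinity.

    affinity: symmetric list of lists; affinity[i][j] is a hypothetical
        similarity score between task i and task j (an assumption here).
    threshold: merging stops once the best candidate merge scores below it.
    """
    # Start with every task in its own group (its own branch).
    groups = [[t] for t in range(len(affinity))]

    def avg_affinity(a, b):
        # Mean affinity over all cross-group task pairs.
        return sum(affinity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(groups) > 1:
        # Pick the pair of groups with the highest average affinity.
        x, y = max(
            combinations(range(len(groups)), 2),
            key=lambda p: avg_affinity(groups[p[0]], groups[p[1]]),
        )
        if avg_affinity(groups[x], groups[y]) < threshold:
            break
        merged = sorted(groups[x] + groups[y])
        groups = [g for k, g in enumerate(groups) if k not in (x, y)] + [merged]
    return sorted(groups)


# Toy example: four hypothetical attribute tasks with made-up affinities.
affinity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
branches = group_tasks(affinity, threshold=0.5)
print(branches)
```

In this toy setting, tasks 0 and 1 end up sharing one branch and tasks 2 and 3 another. A lower threshold yields fewer, wider-shared groups, and a higher one keeps tasks separate, which loosely mirrors the trade-off between compactness and task-specific capacity described above.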