LG · CV · ML · Feb 9, 2020

Multi-Task Learning by a Top-Down Control Network

arXiv:2002.03335v3 · 7 citations
AI Analysis

This paper addresses multi-task learning in vision systems, where a single network must perform diverse tasks, and reports improved accuracy and better scaling as the number of tasks grows.

The paper tackles the problem of executing multiple vision tasks accurately and efficiently in a single network. It introduces a top-down control network that modifies the activations of the main recognition network according to the selected task, the image content, and the spatial location, and it reports significantly better results than alternative state-of-the-art approaches on four datasets.

As the range of tasks performed by a general vision system expands, executing multiple tasks accurately and efficiently in a single network has become an important and still open problem. Recent computer vision approaches address this problem by branching networks, or by a channel-wise modulation of the network feature maps with task-specific vectors. We present a novel architecture that uses a dedicated top-down control network to modify the activation of all the units in the main recognition network in a manner that depends on the selected task, image content, and spatial location. We show the effectiveness of our scheme by achieving significantly better results than alternative state-of-the-art approaches on four datasets. We further demonstrate our advantages in terms of task selectivity, scaling the number of tasks, and interpretability.
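The contrast the abstract draws between channel-wise modulation and per-unit top-down control can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the array shapes, and in particular `toy_control` (a hypothetical stand-in for the control network) are assumptions made only for illustration.

```python
import numpy as np


def channelwise_modulation(features, task_vectors, task_id):
    """Scale each channel by one task-specific scalar.

    features:     array of shape (C, H, W)
    task_vectors: array of shape (num_tasks, C)
    The gate is the same at every location and for every image.
    """
    gate = task_vectors[task_id][:, None, None]  # (C, 1, 1), broadcast over H and W
    return features * gate


def toy_control(features, task_id, num_tasks=4):
    """Hypothetical stand-in for a control network (assumption, not from the paper).

    Returns one gate per unit, shape (C, H, W). The gate depends on the task
    (through the offset) and on image content and location (through the features).
    """
    offset = (task_id + 1) / num_tasks
    return 1.0 / (1.0 + np.exp(-(features - offset)))  # element-wise sigmoid


def topdown_modulation(features, control_fn, task_id):
    """Scale every unit individually with gates produced by a control function."""
    gates = control_fn(features, task_id)  # (C, H, W)
    return features * gates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((3, 4, 4))          # C=3, H=W=4
    task_vectors = rng.uniform(0.5, 1.5, size=(2, 3))  # 2 tasks, 3 channels

    print(channelwise_modulation(features, task_vectors, task_id=1).shape)
    print(topdown_modulation(features, toy_control, task_id=1).shape)
```

In the channel-wise variant a task contributes only C numbers, whereas the top-down variant produces a separate gate for every unit, so the same task can affect different image locations differently. In the architecture the abstract describes, those gates come from a dedicated control network rather than from a fixed formula like the one used here.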

