CV, Apr 5, 2019

Branched Multi-Task Networks: Deciding What Layers To Share

arXiv:1904.02920v5, 161 citations
Originality: Incremental advance
AI Analysis

The paper addresses a practical challenge for practitioners: designing multi-task learning architectures efficiently. The advance is incremental, since it builds on prior branched network methods.

The paper tackles the problem of automatically designing branched multi-task neural networks by leveraging task affinities to decide which layers to share, resulting in architectures that achieve the highest performance for a given parameter budget or require the least parameters for a performance threshold across diverse datasets.

In the context of multi-task learning, neural networks with branched architectures have often been employed to jointly tackle the tasks at hand. Such ramified networks typically start with a number of shared layers, after which different tasks branch out into their own sequence of layers. Understandably, as the number of possible network configurations is combinatorially large, deciding what layers to share and where to branch out becomes cumbersome. Prior works have either relied on ad hoc methods to determine the level of layer sharing, which is suboptimal, or utilized neural architecture search techniques to establish the network design, which is considerably expensive. In this paper, we go beyond these limitations and propose an approach to automatically construct branched multi-task networks, by leveraging the employed tasks' affinities. Given a specific budget, i.e. number of learnable parameters, the proposed approach generates architectures, in which shallow layers are task-agnostic, whereas deeper ones gradually grow more task-specific. Extensive experimental analysis across numerous, diverse multi-tasking datasets shows that, for a given budget, our method consistently yields networks with the highest performance, while for a certain performance threshold it requires the least amount of learnable parameters.
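To make the idea of affinity-driven branching concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a single table of pairwise task affinities and swaps in plain average-linkage agglomerative clustering to decide which tasks share a branch. The task names, the affinity values, and the helpers `make_affinity` and `group_tasks` are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def make_affinity(pairs):
    """Build a symmetric affinity lookup from {(task_a, task_b): score}."""
    affinity = {}
    for (a, b), score in pairs.items():
        affinity.setdefault(a, {})[b] = score
        affinity.setdefault(b, {})[a] = score
    return affinity


def group_tasks(affinity, num_groups):
    """Greedily merge the most affine clusters until num_groups remain.

    Average-linkage agglomerative clustering, used here as a stand-in
    for a real branching procedure (assumption, not the paper's method).
    """
    clusters = [[task] for task in sorted(affinity)]

    def linkage(a, b):
        # Mean pairwise affinity between the members of two clusters.
        return sum(affinity[x][y] for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > num_groups:
        # Pick the pair of clusters with the highest mean affinity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical affinity scores in [0, 1]; higher means more related.
aff = make_affinity({
    ("segmentation", "depth"): 0.4,
    ("segmentation", "normals"): 0.3,
    ("segmentation", "edges"): 0.7,
    ("depth", "normals"): 0.9,
    ("depth", "edges"): 0.2,
    ("normals", "edges"): 0.3,
})

# One group at the shallowest stage, more groups at deeper stages.
for depth, k in enumerate((1, 2, 4)):
    print(f"stage {depth}: {group_tasks(aff, k)}")
```

Cutting the same merge hierarchy at 1, 2, and 4 groups gives a layout in which early stages are fully shared and later stages split into increasingly task-specific branches, mirroring the qualitative behaviour the abstract describes. A parameter budget could then be respected by choosing how many groups to allow at each stage, though that selection step is not modelled here.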

