ML, LG | Dec 5, 2018

Adapting Auxiliary Losses Using Gradient Similarity

Yunshu Du, Wojciech M. Czarnecki, Siddhant M. Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, Balaji Lakshminarayanan

arXiv:1812.02224v2 | 29.9180 citations

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in multi-task learning for researchers and practitioners, offering an incremental improvement over existing methods.

The paper tackles the problem of determining when auxiliary tasks help or hurt neural network training. It proposes an adaptive weighting method based on the cosine similarity between task gradients, shows that the method is guaranteed to converge to critical points of the main task, and demonstrates practical improvements in multi-task supervised learning and reinforcement learning.

One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations. However, it is not always trivial to know if an auxiliary task will be helpful for the main task and when it could start hurting. We propose to use the cosine similarity between gradients of tasks as an adaptive weight to detect when an auxiliary loss is helpful to the main loss. We show that our approach is guaranteed to converge to critical points of the main task and demonstrate the practical usefulness of the proposed algorithm in a few domains: multi-task supervised learning on subsets of ImageNet, reinforcement learning on gridworld, and reinforcement learning on Atari games.
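
The weighting idea in the abstract can be illustrated with a minimal NumPy sketch. It is one plausible reading, not the authors' implementation. The function names `cosine_similarity` and `combined_gradient` are hypothetical. Clipping the weight at zero, so that a conflicting auxiliary gradient is dropped, is an assumption made here; the abstract only says that the cosine similarity serves as an adaptive weight.

```python
import numpy as np


def cosine_similarity(g_main, g_aux, eps=1e-12):
    """Cosine of the angle between two flattened gradient vectors."""
    denom = np.linalg.norm(g_main) * np.linalg.norm(g_aux)
    if denom < eps:
        # A (near-)zero gradient has no direction; treat it as uninformative.
        return 0.0
    return float(np.dot(g_main, g_aux) / denom)


def combined_gradient(g_main, g_aux):
    """Return (update direction, weight) for one training step.

    Assumption: the auxiliary gradient is scaled by max(0, cos), so it
    contributes only when it points in a similar direction to the main one.
    """
    weight = max(0.0, cosine_similarity(g_main, g_aux))
    return g_main + weight * g_aux, weight


# Toy gradients (illustrative values only, not taken from the paper).
g_main = np.array([1.0, 0.0])
g_aligned = np.array([2.0, 0.0])      # same direction as the main gradient
g_opposed = np.array([-1.0, 0.0])     # opposite direction
g_orthogonal = np.array([0.0, 3.0])   # unrelated direction

for name, g_aux in [("aligned", g_aligned),
                    ("opposed", g_opposed),
                    ("orthogonal", g_orthogonal)]:
    update, weight = combined_gradient(g_main, g_aux)
    print(name, weight, update)
```

In this sketch an aligned auxiliary gradient is added in full, while an opposed or orthogonal one receives weight zero, so the update falls back to the main-task gradient alone.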
