LG · Mar 3, 2021

RotoGrad: Gradient Homogenization in Multitask Learning

arXiv:2103.02631v3 · 117 citations · Has Code
AI Analysis

RotoGrad addresses negative transfer in multitask learning, a problem that arises in applications like computer vision and reinforcement learning. It represents an incremental improvement over prior methods.

The paper tackles negative transfer in multitask learning by addressing disparities in gradient magnitudes and directions across tasks. It introduces RotoGrad, which jointly homogenizes both aspects and outperforms competing methods on datasets like CelebA and NYUv2.

Multitask learning is being increasingly adopted in application domains like computer vision and reinforcement learning. However, optimally exploiting its advantages remains a major challenge due to the effect of negative transfer. Previous works have traced this issue to the disparities in gradient magnitudes and directions across tasks when optimizing the shared network parameters. While recent work has acknowledged that negative transfer is a two-fold problem, existing approaches fall short, as they focus only on either homogenizing the gradient magnitude across tasks or greedily changing the gradient directions, overlooking future conflicts. In this work, we introduce RotoGrad, an algorithm that tackles negative transfer as a whole: it jointly homogenizes gradient magnitudes and directions, while ensuring training convergence. We show that RotoGrad outperforms competing methods in complex problems, including multi-label classification in CelebA and computer vision tasks in the NYUv2 dataset. A PyTorch implementation can be found at https://github.com/adrianjav/rotograd.
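The two disparities named in the abstract can be made concrete with a small sketch. This is not the RotoGrad algorithm and does not use the linked repository; it only measures the two quantities on made-up numbers. The gradient vectors and the helper names (`norm`, `cosine`, `equalize_magnitudes`) are hypothetical, and rescaling to the mean norm is a simple stand-in for magnitude homogenization.

```python
import math

def norm(g):
    """Euclidean norm of a gradient given as a list of floats."""
    return math.sqrt(sum(x * x for x in g))

def cosine(g1, g2):
    """Cosine similarity; a negative value means the directions conflict."""
    dot = sum(a * b for a, b in zip(g1, g2))
    return dot / (norm(g1) * norm(g2))

def equalize_magnitudes(grads):
    """Rescale every task gradient to the mean norm across tasks (illustrative only)."""
    norms = [norm(g) for g in grads]
    target = sum(norms) / len(norms)
    return [[x * target / n for x in g] for g, n in zip(grads, norms)]

# Hypothetical gradients of two task losses w.r.t. the shared parameters.
g_task_a = [3.0, 4.0]    # large magnitude (norm 5)
g_task_b = [0.6, -0.8]   # small magnitude (norm 1), partly opposed to task A

cos_ab = cosine(g_task_a, g_task_b)                 # direction disparity
scaled = equalize_magnitudes([g_task_a, g_task_b])  # magnitude disparity removed
cos_scaled = cosine(scaled[0], scaled[1])           # direction conflict remains
```

In this toy case the summed raw gradient is dominated by task A, and equalizing the norms leaves the cosine similarity unchanged, so the directional conflict persists. That is consistent with the abstract's point that treating magnitude alone is not sufficient.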

Code Implementations (2 repos)
