Domain Expansion: A Latent Space Construction Framework for Multi-Task Learning
This work addresses the challenge of training a single network on multiple objectives and offers a new way to prevent representation collapse. The contribution is incremental in that it builds on existing multi-task learning methods.
The paper tackles latent representation collapse in multi-task learning, where conflicting gradients degrade shared representations. It introduces the Domain Expansion framework, which restructures the latent space with orthogonal pooling so that each objective is assigned its own subspace and all subspaces are mutually orthogonal. The authors report improved performance and an interpretable latent space on benchmarks including ShapeNet, MPIIGaze, and Rotated MNIST.
Training a single network with multiple objectives often leads to conflicting gradients that degrade shared representations, forcing them into a compromised state that is suboptimal for any single task--a problem we term latent representation collapse. We introduce Domain Expansion, a framework that prevents these conflicts by restructuring the latent space itself. Our framework uses a novel orthogonal pooling mechanism to construct a latent space where each objective is assigned to a mutually orthogonal subspace. We validate our approach across diverse benchmarks--including ShapeNet, MPIIGaze, and Rotated MNIST--on challenging multi-objective problems combining classification with pose and gaze estimation. Our experiments demonstrate that this structure not only prevents collapse but also yields an explicit, interpretable, and compositional latent space where concepts can be directly manipulated.
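The idea of assigning each objective to a mutually orthogonal subspace can be illustrated with a small NumPy sketch. This is not the paper's orthogonal pooling mechanism, whose details the abstract does not give. It is a generic construction under stated assumptions: the function names `make_orthogonal_subspaces` and `project`, the QR-based basis, the latent size of 16, and the "class" and "pose" subspace sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def make_orthogonal_subspaces(latent_dim, dims_per_task, seed=0):
    """Return one orthonormal basis per task, shape (latent_dim, k_i).

    Hypothetical helper: the bases are columns of a single orthonormal
    matrix, so subspaces belonging to different tasks are mutually orthogonal.
    """
    total = sum(dims_per_task)
    if total > latent_dim:
        raise ValueError("subspace dimensions exceed the latent dimension")
    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix yields orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((latent_dim, total)))
    # Split the columns into consecutive blocks, one block per task.
    split_points = np.cumsum(dims_per_task)[:-1]
    return np.split(q, split_points, axis=1)


def project(z, basis):
    """Coordinates of each latent vector in z (batch, latent_dim) within one subspace."""
    return z @ basis


# Assumed setup: a 16-d latent space, 4 dims for "class", 3 dims for "pose".
bases = make_orthogonal_subspaces(latent_dim=16, dims_per_task=[4, 3])
z = np.random.default_rng(1).standard_normal((8, 16))

class_code = project(z, bases[0])
pose_code = project(z, bases[1])

# Move every latent vector by +1 along each pose direction.
z_shifted = z + np.ones((8, 3)) @ bases[1].T

# Because the subspaces are orthogonal, the class coordinates do not change,
# while each pose coordinate increases by exactly 1.
class_code_shifted = project(z_shifted, bases[0])
pose_code_shifted = project(z_shifted, bases[1])
```

In this toy setting, an edit along one task's subspace leaves the other task's coordinates untouched, which is the sense in which an orthogonally structured latent space allows concepts to be manipulated independently.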