CV · Apr 6, 2022

Universal Representations: A Unified Look at Multiple Task and Domain Learning

arXiv:2204.02744v2 · 38 citations · h-index: 32
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of multi-task and multi-domain learning in computer vision. Its unified approach is an incremental advance, but it improves efficiency and performance.

The paper tackles the problem of unbalanced optimization when jointly learning multiple vision tasks and domains. It proposes universal representations that distill knowledge from task/domain-specific networks into a single model. The authors report state-of-the-art performance on NYU-v2, Cityscapes, Visual Decathlon, and MetaDataset.

We propose a unified look at jointly learning multiple vision tasks and visual domains through universal representations, a single deep neural network. Learning multiple problems simultaneously involves minimizing a weighted sum of multiple loss functions with different magnitudes and characteristics, and thus results in an unbalanced state in which one loss dominates the optimization, giving poor results compared to learning a separate model for each problem. To this end, we propose distilling the knowledge of multiple task/domain-specific networks into a single deep neural network after aligning its representations with the task/domain-specific ones through small-capacity adapters. We rigorously show that universal representations achieve state-of-the-art performance in learning multiple dense prediction problems in NYU-v2 and Cityscapes, multiple image classification problems from diverse domains in Visual Decathlon Dataset, and cross-domain few-shot learning in MetaDataset. Finally, we also conduct multiple analyses through ablation and qualitative studies.
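The distillation step in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes linear adapters stored as plain matrices, and it assumes the alignment loss is a squared distance between L2-normalized features, a choice made here so that every per-task term stays on the same scale (between 0 and 4) and no single term can dominate the sum. All function names and the toy values are hypothetical.

```python
from typing import List, Optional, Sequence

Vector = List[float]
Matrix = List[List[float]]


def apply_adapter(adapter: Matrix, feature: Sequence[float]) -> Vector:
    """Linear adapter: map the shared feature into one teacher's feature space."""
    return [sum(w * x for w, x in zip(row, feature)) for row in adapter]


def l2_normalize(v: Sequence[float], eps: float = 1e-12) -> Vector:
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / (norm + eps) for x in v]


def alignment_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Squared Euclidean distance between the L2-normalized features (assumed loss)."""
    s, t = l2_normalize(student), l2_normalize(teacher)
    return sum((a - b) ** 2 for a, b in zip(s, t))


def distillation_objective(
    shared_feature: Sequence[float],
    adapters: Sequence[Matrix],
    teacher_features: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-task alignment losses between the adapted shared
    feature and each frozen task/domain-specific (teacher) feature."""
    if weights is None:
        weights = [1.0] * len(adapters)
    return sum(
        w * alignment_loss(apply_adapter(a, shared_feature), t)
        for w, a, t in zip(weights, adapters, teacher_features)
    )


if __name__ == "__main__":
    shared = [1.0, 2.0]                      # feature from the single shared network
    adapters = [
        [[1.0, 0.0], [0.0, 1.0]],            # task 0: identity adapter
        [[0.0, 1.0], [1.0, 0.0]],            # task 1: adapter that swaps coordinates
    ]
    teachers = [[2.0, 4.0], [2.0, 1.0]]      # frozen teacher features (toy values)
    print(distillation_objective(shared, adapters, teachers))
```

In a real training loop the shared network and the adapters would be optimized jointly to minimize this objective alongside the task losses, while the teacher networks stay frozen. The sketch only evaluates the objective for fixed values.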

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
