LGCVJul 23, 2021

Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks

arXiv:2107.10963v13 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient knowledge transfer and modularity in neural networks for researchers and practitioners in machine learning, though it is incremental as it builds on existing modular and ResNet concepts.

The paper tackles the problem of multi-task learning and knowledge transfer by proposing a modular network approach based on isometric ResNet blocks, enabling reusable computational blocks and adjustable computation order. It achieves competitive results in multi-task learning, transfer learning, and domain adaptation, with applications like improving ImageNet accuracy without parameter increase.

Conditional computation and modular networks have been recently proposed for multitask learning and other problems as a way to decompose problem solving into multiple reusable computational blocks. We propose a new approach for learning modular networks based on the isometric version of ResNet with all residual blocks having the same configuration and the same number of parameters. This architectural choice allows adding, removing and changing the order of residual blocks. In our method, the modules can be invoked repeatedly and allow knowledge transfer to novel tasks by adjusting the order of computation. This allows soft weight sharing between tasks with only a small increase in the number of parameters. We show that our method leads to interpretable self-organization of modules in case of multi-task learning, transfer learning and domain adaptation while achieving competitive results on those tasks. From practical perspective, our approach allows to: (a) reuse existing modules for learning new task by adjusting the computation order, (b) use it for unsupervised multi-source domain adaptation to illustrate that adaptation to unseen data can be achieved by only manipulating the order of pretrained modules, (c) show how our approach can be used to increase accuracy of existing architectures for image classification tasks such as ImageNet, without any parameter increase, by reusing the same block multiple times.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes