Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal
This work addresses task interference in multi-task learning, though it appears incremental because it builds on existing neural network methods.
The paper tackles the problem of static architectures hindering multi-task learning by proposing Multi-Task Structural Learning (MTSL), which dynamically learns both architecture and parameters, achieving competitive generalization and improved robustness to out-of-distribution data.
Multi-task learning has the potential to improve generalization by maximizing positive transfer between tasks while reducing task interference. Fully achieving this potential is hindered by manually designed architectures that remain static throughout training. In contrast, learning in the brain occurs through structural changes that take place in tandem with changes in synaptic strength. Thus, we propose \textit{Multi-Task Structural Learning (MTSL)}, which simultaneously learns the multi-task architecture and its parameters. MTSL begins with an identical single-task network for each task and alternates between a task-learning phase and a structural-learning phase. In the task-learning phase, each network specializes in its corresponding task. In each structural-learning phase, starting from the earliest layer, locally similar task layers first transfer their knowledge to a newly created group layer and are then removed. MTSL then uses the group layer in place of the removed task layers and moves on to the next layer. Our empirical results show that MTSL achieves generalization competitive with various baselines and improves robustness to out-of-distribution data.