Routing Networks and the Challenges of Modular and Compositional Computation
This paper addresses a key problem in modular AI and is relevant to researchers in that area, but its contribution is incremental: it analyzes existing challenges rather than introducing a new solution.
The paper tackles the training challenges of compositional models in which module parameters and their composition must be learned jointly. It analyzes these issues in routing networks and empirically examines design decisions such as how module composition is decided, how modules are updated, and whether regularization is used.
Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality. Recent work has shown that compositional solutions can be learned and offer substantial gains across a variety of domains, including multi-task learning, language modeling, visual question answering, machine comprehension, and others. However, such models present unique challenges during training when both the module parameters and their composition must be learned jointly. In this paper, we identify several of these issues and analyze their underlying causes. Our discussion focuses on routing networks, a general approach to this problem, and empirically examines the interplay of these challenges and a variety of design decisions. In particular, we consider the effect of how the algorithm decides on module composition, how the algorithm updates the modules, and whether the algorithm uses regularization.
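To make the idea of jointly chosen modules and compositions concrete, here is a minimal sketch of a routing forward pass. It is an illustration under stated assumptions, not the paper's implementation: the names `make_module`, `matvec`, and `route`, the single-layer tanh modules, the linear router, and the epsilon-greedy selection rule are all hypothetical choices made for this example. Training (how the router and modules are updated, and any regularization) is deliberately left out.

```python
import math
import random


def make_module(dim, rng):
    """Return a hypothetical module: the weights of one dense tanh layer."""
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def route(x, modules, router, depth, rng, epsilon=0.0):
    """Compose `depth` modules, each picked by a linear router.

    `router` holds one score vector per module (an assumption of this
    sketch). With probability `epsilon` the module is drawn uniformly
    at random instead of greedily. Returns the final activation and the
    sequence of chosen module indices.
    """
    h, path = list(x), []
    for _ in range(depth):
        scores = matvec(router, h)  # one score per candidate module
        if rng.random() < epsilon:
            choice = rng.randrange(len(modules))  # exploratory choice
        else:
            choice = max(range(len(modules)), key=scores.__getitem__)  # greedy
        h = [math.tanh(v) for v in matvec(modules[choice], h)]  # apply module
        path.append(choice)
    return h, path


rng = random.Random(0)
dim, num_modules = 4, 3
modules = [make_module(dim, rng) for _ in range(num_modules)]
router = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_modules)]
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]

out, path = route(x, modules, router, depth=2, rng=rng)
print("chosen modules:", path)
```

The sketch shows why joint learning is delicate: the output depends both on which modules the router picks and on the weights inside those modules, so a change to either one shifts the learning signal seen by the other.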