Exploring Shared Structures and Hierarchies for Multiple NLP Tasks
This work addresses scalable multi-task learning for NLP by automating the design of shared architectures. The contribution is incremental, since it builds on existing neural architecture search methods.
The paper tackles the challenge of automatically designing shared neural architectures for multi-task learning in NLP, using reinforcement learning to search for optimal sharing schemes, and demonstrates benefits through experiments on text classification and sequence labeling tasks.
Designing a shared neural architecture plays an important role in multi-task learning. The challenge is that finding an optimal sharing scheme relies heavily on expert knowledge and does not scale to a large number of diverse tasks. Inspired by the promising work on neural architecture search (NAS), we apply reinforcement learning to automatically find possible shared architectures for multi-task learning. Specifically, we use a controller to select from a set of shareable modules and assemble a task-specific architecture, and we repeat the same procedure for the other tasks. The controller is trained with reinforcement learning to maximize the expected accuracies for all tasks. We conduct extensive experiments on two types of tasks, text classification and sequence labeling, which demonstrate the benefits of our approach.
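The controller loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a simplified controller made of independent per-task, per-layer logits (the abstract does not specify the controller's form), a plain REINFORCE update with a moving-average baseline, and a hypothetical `toy_reward` that stands in for the validation accuracy of the assembled model. All names here (`TOY_TARGETS`, `sample_architecture`, `reinforce_step`, `train_controller`) are illustrative.

```python
import math
import random

# Hypothetical "best" module per layer for each task; used only by the toy reward.
TOY_TARGETS = {"classification": [0, 1, 2], "labeling": [0, 1, 3]}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_architecture(logits, rng):
    """Sample one shared-module index per layer; return choices and probabilities."""
    choices, probs = [], []
    for layer_logits in logits:
        p = softmax(layer_logits)
        choices.append(rng.choices(range(len(p)), weights=p, k=1)[0])
        probs.append(p)
    return choices, probs


def toy_reward(task, choices):
    """Assumption: stand-in for validation accuracy of the assembled architecture."""
    target = TOY_TARGETS[task]
    return sum(c == t for c, t in zip(choices, target)) / len(target)


def reinforce_step(logits, choices, probs, advantage, lr):
    """REINFORCE update: d log pi(a) / d logit_j = 1[j == a] - p_j."""
    for layer, (a, p) in enumerate(zip(choices, probs)):
        for j in range(len(p)):
            grad = (1.0 if j == a else 0.0) - p[j]
            logits[layer][j] += lr * advantage * grad


def train_controller(num_modules=4, num_layers=3, steps=200, lr=0.5, seed=0):
    """Repeat the sample / evaluate / update procedure for every task."""
    rng = random.Random(seed)
    logits = {
        task: [[0.0] * num_modules for _ in range(num_layers)]
        for task in TOY_TARGETS
    }
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        for task in TOY_TARGETS:
            choices, probs = sample_architecture(logits[task], rng)
            reward = toy_reward(task, choices)
            reinforce_step(logits[task], choices, probs, reward - baseline, lr)
            baseline = 0.9 * baseline + 0.1 * reward
    return logits
```

In this sketch, two tasks share a module whenever their sampled indices agree at a given layer, which is one simple way to read "assembling a task-specific architecture from shareable modules". In a real setting the reward would come from training and evaluating the assembled network, which is far more expensive than the toy function used here.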