Cascaded Multi-task Adaptive Learning Based on Neural Architecture Search
This paper addresses the parameter and memory inefficiency of optimizing cascaded models for multi-task learning, though the contribution is incremental because it builds on existing NAS and adapter techniques.
The paper tackles the inefficiency of fine-tuning cascaded multi-task models by proposing a Neural Architecture Search (NAS) method that automatically selects an adaptive operation (freezing, inserting an adapter, or fine-tuning) for each module. On the SLURP dataset, the searched scheme trains only 8.7% of the parameters updated by full fine-tuning while achieving better performance.
Cascading multiple pre-trained models is an effective way to compose an end-to-end system. However, fine-tuning the full cascaded model is parameter and memory inefficient, and our observations reveal that applying only adapter modules to the cascaded model cannot achieve performance comparable to fine-tuning. We propose an automatic and effective adaptive learning method to optimize end-to-end cascaded multi-task models based on the Neural Architecture Search (NAS) framework. The candidate adaptive operations for each module are freezing, inserting an adapter, and fine-tuning. We further add a penalty term to the loss that takes the number of trainable parameters into account and thereby constrains the learned structure. The penalty term successfully restricts the searched architecture, and the proposed approach finds a tuning scheme similar to the hand-crafted one, reducing the trainable parameters to 8.7% of those in full fine-tuning on SLURP while achieving even better performance.
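
The search over per-module operations with a parameter-aware penalty can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty or the search procedure, so everything here is an assumption: a DARTS-style softmax relaxation over the three operations, a penalty equal to the expected trainable-parameter count normalized by the full fine-tuning count, and hypothetical names (`expected_trainable_params`, `penalized_loss`, `derive_scheme`, `lam`) and parameter counts chosen only for illustration.

```python
import math

# Candidate adaptive operations for each module, as listed in the abstract.
OPS = ("frozen", "adapter", "fine-tune")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_trainable_params(arch_logits, op_costs):
    """Expected trainable-parameter count under the relaxed architecture.

    arch_logits[i][k]: architecture weight of operation k on module i.
    op_costs[i][k]:    trainable parameters if module i uses operation k.
    """
    total = 0.0
    for logits, costs in zip(arch_logits, op_costs):
        probs = softmax(logits)
        total += sum(p * c for p, c in zip(probs, costs))
    return total


def penalized_loss(task_loss, arch_logits, op_costs, lam):
    """Task loss plus an assumed penalty on the fraction of trainable parameters."""
    full = sum(costs[OPS.index("fine-tune")] for costs in op_costs)
    penalty = expected_trainable_params(arch_logits, op_costs) / full
    return task_loss + lam * penalty


def derive_scheme(arch_logits):
    """Pick the highest-weighted operation for each module after the search."""
    return [OPS[max(range(len(OPS)), key=lambda k: logits[k])]
            for logits in arch_logits]


# Hypothetical two-module cascade; the parameter counts are made up.
costs = [[0, 1_000_000, 90_000_000], [0, 500_000, 100_000_000]]
logits = [[2.0, 0.1, -1.0], [0.0, 0.5, 3.0]]
loss = penalized_loss(task_loss=1.0, arch_logits=logits, op_costs=costs, lam=0.1)
scheme = derive_scheme(logits)  # ['frozen', 'fine-tune']
```

In this sketch a larger `lam` pushes the architecture weights toward freezing or adapters, since fine-tuning a module contributes the most to the expected parameter count. An actual implementation would learn the logits by gradient descent jointly with the model weights, which is omitted here.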