LG · CL · Oct 22, 2020

AdapterDrop: On the Efficiency of Adapters in Transformers

Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych

arXiv:2010.11918v2 50.7708 citations

Originality: Incremental advance

AI Analysis

This work addresses efficiency issues for users of large transformer models, but it is incremental as it builds on existing adapter-based methods.

The paper targets the computational cost of fine-tuning and inference with large pre-trained transformers. It proposes AdapterDrop, which removes adapters from the lower transformer layers during training and inference. This dynamically reduces computational overhead, notably when running inference over multiple tasks simultaneously, with minimal loss in task performance.

Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, dynamically reducing the model size, and by training light-weight adapters. In this paper, we propose AdapterDrop, removing adapters from lower transformer layers during training and inference, which incorporates concepts from all three directions. We show that AdapterDrop can dynamically reduce the computational overhead when performing inference over multiple tasks simultaneously, with minimal decrease in task performances. We further prune adapters from AdapterFusion, which improves the inference efficiency while maintaining the task performances entirely.
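The saving in multi-task inference can be illustrated with a toy sketch. It is not the authors' implementation: `base_layer`, `make_adapter`, and `multi_task_forward` are hypothetical names, the layers are simple arithmetic stand-ins for transformer layers, and each task reuses one adapter function at every layer for brevity. The sketch assumes the base model is frozen and shared across tasks, so once adapters are dropped from the first `drop_n` layers, those layers produce the same output for every task and need to be evaluated only once.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Adapter = Callable[[Vector], Vector]


def base_layer(x: Vector, layer_idx: int) -> Vector:
    """Hypothetical stand-in for one frozen transformer layer shared by all tasks."""
    return [v + 0.1 * (layer_idx + 1) for v in x]


def make_adapter(scale: float) -> Adapter:
    """Hypothetical task-specific adapter: a small residual update."""
    return lambda x: [v + scale * v for v in x]


def multi_task_forward(
    x: Vector,
    adapters: Dict[str, Adapter],
    num_layers: int,
    drop_n: int,
) -> Tuple[Dict[str, Vector], int]:
    """Run all tasks on one input with adapters dropped from the first drop_n layers.

    Returns the per-task outputs and the number of base-layer evaluations.
    """
    base_calls = 0

    # Layers without adapters are task-independent: evaluate them once.
    shared = x
    for i in range(drop_n):
        shared = base_layer(shared, i)
        base_calls += 1

    # Remaining layers carry adapters, so each task needs its own pass.
    outputs: Dict[str, Vector] = {}
    for task, adapter in adapters.items():
        h = shared
        for i in range(drop_n, num_layers):
            h = adapter(base_layer(h, i))
            base_calls += 1
        outputs[task] = h

    return outputs, base_calls


if __name__ == "__main__":
    tasks = {
        "task_a": make_adapter(0.01),
        "task_b": make_adapter(0.02),
        "task_c": make_adapter(0.03),
    }
    _, full = multi_task_forward([1.0, 2.0], tasks, num_layers=12, drop_n=0)
    _, dropped = multi_task_forward([1.0, 2.0], tasks, num_layers=12, drop_n=5)
    print(full, dropped)
```

With 12 layers and three tasks, this toy setup needs 36 base-layer evaluations when every layer has an adapter and 26 when adapters are dropped from the first five layers (5 shared evaluations plus 3 × 7 task-specific ones). The effect on task performance is not modeled here.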
