LG | CL | Oct 22, 2020

AdapterDrop: On the Efficiency of Adapters in Transformers

arXiv:2010.11918v2 | 708 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency problems faced by users of large transformer models, but the advance is incremental because it builds on existing adapter-based methods.

The paper tackles the computational cost of fine-tuning and running inference with large pre-trained transformers. It proposes AdapterDrop, which removes adapters from the lower transformer layers during training and inference. This dynamically reduces computational overhead with only a minimal decrease in task performance.

Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, dynamically reducing the model size, and by training light-weight adapters. In this paper, we propose AdapterDrop, removing adapters from lower transformer layers during training and inference, which incorporates concepts from all three directions. We show that AdapterDrop can dynamically reduce the computational overhead when performing inference over multiple tasks simultaneously, with minimal decrease in task performances. We further prune adapters from AdapterFusion, which improves the inference efficiency while maintaining the task performances entirely.
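The idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: `ToyLayer`, `build_task_model`, and `multitask_forward` are hypothetical names, and the scalar arithmetic only stands in for real transformer layers and bottleneck adapters. The sketch assumes that once the adapters are removed from the first `n_dropped` layers, those layers contain only frozen pre-trained weights, so their output is identical for every task and can be computed once and reused during multi-task inference.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vector = List[float]


@dataclass
class ToyLayer:
    """Toy stand-in for one transformer layer (hypothetical, not a real model)."""
    scale: float                   # stands in for the frozen pre-trained weights
    adapter_gain: Optional[float]  # stands in for task-specific adapter weights; None = dropped

    def forward(self, x: Vector) -> Vector:
        h = [self.scale * v for v in x]
        if self.adapter_gain is not None:
            # Adapter with a residual connection: h + f(h)
            h = [v + self.adapter_gain * v for v in h]
        return h


def build_task_model(scales: List[float], adapter_gains: List[float],
                     n_dropped: int) -> List[ToyLayer]:
    """Build one task's layer stack, dropping the adapters of the first n_dropped layers."""
    return [
        ToyLayer(scale, None if i < n_dropped else gain)
        for i, (scale, gain) in enumerate(zip(scales, adapter_gains))
    ]


def multitask_forward(x: Vector, task_models: List[List[ToyLayer]],
                      n_dropped: int) -> Tuple[List[Vector], int]:
    """Run all task models on x, sharing the adapter-free lower layers.

    Returns the per-task outputs and the number of layer evaluations performed.
    """
    layer_calls = 0
    shared = x
    # Lower layers have no adapters, so they are the same for every task: run them once.
    for layer in task_models[0][:n_dropped]:
        shared = layer.forward(shared)
        layer_calls += 1
    outputs = []
    # Upper layers still carry task-specific adapters: run them once per task.
    for model in task_models:
        h = shared
        for layer in model[n_dropped:]:
            h = layer.forward(h)
            layer_calls += 1
        outputs.append(h)
    return outputs, layer_calls


if __name__ == "__main__":
    scales = [1.0, 0.5, 2.0, 1.5]                       # 4 shared "pre-trained" layers
    gains_per_task = [[0.1] * 4, [0.2] * 4, [0.3] * 4]  # 3 tasks
    for n in (0, 2):
        models = [build_task_model(scales, g, n) for g in gains_per_task]
        _, calls = multitask_forward([1.0, -2.0], models, n)
        print(n, calls)  # n=0 gives 12 layer calls, n=2 gives 8
```

With 4 layers and 3 tasks, dropping the adapters from the first 2 layers cuts the toy cost from 12 layer evaluations to 2 + 3 × 2 = 8. The saving grows with the number of tasks and with the number of adapter-free layers; the effect on task performance is not modelled here.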

Code Implementations: 1 repo