CL · Feb 16, 2024

Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

arXiv:2402.10639v2 · 14.629 citations · h-index: 2 · ACL

Originality: Synthesis-oriented

AI Analysis

This work addresses the reliability of parameter-efficient fine-tuning methods for adapting models to new domains, but it is incremental as it builds on existing adapter mixing techniques.

The study investigates the generalizability of mixing domain-specific adapters for pre-trained language models on unseen in-domain examples, finding a negative correlation between weight sign differences and mixture performance.

Several parameter-efficient fine-tuning methods based on adapters have been proposed as a streamlined way to incorporate not just one kind of specialized knowledge into existing Pre-Trained Language Models (PLMs) but several at once. Recent works such as AdapterSoup propose mixing only a selected subset of domain-specific adapters at inference time via model weight averaging, to optimize performance on novel, unseen domains with excellent computational efficiency. However, the generalizability of this emerging weight-space adapter mixing mechanism on unseen, in-domain examples remains unexplored. In this study, we therefore conduct a comprehensive analysis of the generalizability of domain-specific adapter mixtures in in-domain evaluation. We also investigate the inner workings of these mixtures by analyzing their weight signs, which yields a critical analysis of the negative correlation between the fraction of weight sign differences among the adapters and the generalizability of their mixtures.
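The two quantities the abstract refers to, a weight-averaged adapter mixture and the fraction of weight sign differences, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact definition of the sign-difference fraction, so the version below assumes a simple element-wise comparison between two adapters, and the function names, the dictionary layout, and the toy weights are all hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def sign(x: float) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def average_adapters(adapters: List[Weights]) -> Weights:
    """Mix adapters by uniform element-wise weight averaging."""
    if not adapters:
        raise ValueError("need at least one adapter")
    return {
        name: [
            sum(values) / len(adapters)
            for values in zip(*(adapter[name] for adapter in adapters))
        ]
        for name in adapters[0]
    }


def sign_difference_fraction(a: Weights, b: Weights) -> float:
    """Fraction of weight positions where the two adapters disagree in sign.

    Assumed definition: element-wise comparison over all shared parameters.
    """
    total = 0
    differing = 0
    for name in a:
        for x, y in zip(a[name], b[name]):
            total += 1
            if sign(x) != sign(y):
                differing += 1
    return differing / total if total else 0.0


# Toy adapters with made-up weights, for illustration only.
adapter_a = {"down_proj": [0.5, -0.25, 0.125, -0.5]}
adapter_b = {"down_proj": [0.25, 0.25, -0.125, -1.0]}

mixture = average_adapters([adapter_a, adapter_b])
fraction = sign_difference_fraction(adapter_a, adapter_b)
```

In this toy case the two adapters disagree in sign at two of four positions, and those are exactly the positions where averaging cancels the weights to zero. That gives one intuition for why a larger fraction of sign differences could hurt a mixture, though the sketch makes no claim about the paper's measured results.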
