Limited-Resource Adapters Are Regularizers, Not Linguists
This work addresses low-resource language technology by showing that adapter methods may not leverage linguistic knowledge as expected. It is an incremental contribution that challenges prior assumptions about cross-lingual transfer.
The study examines cross-lingual transfer for low-resource Creole languages using adapter souping and cross-attention fine-tuning. It finds that adapters improve performance but act as regularizers rather than enabling meaningful linguistic transfer: linguistic relatedness shows no meaningful correlation with adapter effectiveness.
Cross-lingual transfer from related high-resource languages is a well-established strategy to enhance low-resource language technologies. Prior work has found that adapters show promise for, e.g., improving low-resource machine translation (MT). In this work, we investigate an adapter souping method combined with cross-attention fine-tuning of a pre-trained MT model to leverage language transfer for three low-resource Creole languages, which exhibit relatedness to different language groups across distinct linguistic dimensions. Our approach improves performance substantially over baselines. However, we find that linguistic relatedness, or even a lack thereof, does not covary meaningfully with adapter performance. Surprisingly, our cross-attention fine-tuning approach appears equally effective with randomly initialized adapters, implying that the benefit of adapters in this setting lies in parameter regularization, and not in meaningful information transfer. We provide analysis supporting this regularization hypothesis. Our findings underscore that neural language processing involves many success factors, and that not all neural methods leverage linguistic knowledge in intuitive ways.
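As an illustration only, adapter souping is commonly understood as averaging the parameters of several trained adapters in weight space. The sketch below assumes that reading; the abstract does not specify the exact procedure, so the function name `soup_adapters`, the dictionary-of-lists representation, and the optional mixing weights are all hypothetical and are not the authors' implementation.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical representation: parameter name -> flattened list of weights.
Adapter = Dict[str, List[float]]


def soup_adapters(
    adapters: Sequence[Adapter],
    weights: Optional[Sequence[float]] = None,
) -> Adapter:
    """Return a weighted average of adapter parameters (a 'soup').

    Assumption: every adapter has the same parameter names and shapes.
    With weights=None, each adapter contributes equally.
    """
    if not adapters:
        raise ValueError("need at least one adapter")
    if weights is None:
        weights = [1.0] * len(adapters)
    if len(weights) != len(adapters):
        raise ValueError("one weight per adapter is required")
    total = float(sum(weights))
    if total == 0.0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    names = set(adapters[0].keys())
    for adapter in adapters:
        if set(adapter.keys()) != names:
            raise ValueError("adapters must share the same parameter names")

    souped: Adapter = {}
    for name in adapters[0]:
        size = len(adapters[0][name])
        if any(len(a[name]) != size for a in adapters):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average across all adapters.
        souped[name] = [
            sum(w * a[name][i] for w, a in zip(norm, adapters))
            for i in range(size)
        ]
    return souped


# Toy adapters standing in for adapters trained on two source languages.
adapter_a = {"down_proj": [1.0, 3.0]}
adapter_b = {"down_proj": [3.0, 5.0]}
uniform_soup = soup_adapters([adapter_a, adapter_b])
weighted_soup = soup_adapters([adapter_a, adapter_b], weights=[3.0, 1.0])
```

Under this reading, the random-adapter comparison described in the abstract would amount to passing randomly initialized parameter sets through the same averaging and fine-tuning steps in place of trained ones.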