Consistency Regularization for Cross-Lingual Fine-Tuning
This work addresses the challenge of transferring task-specific supervision across languages, offering an incremental improvement over existing cross-lingual fine-tuning methods.
The paper improves cross-lingual fine-tuning by proposing consistency regularization with data augmentations, and reports significant performance gains on the XTREME benchmark across text classification, question answering, and sequence labeling.
Fine-tuning pre-trained cross-lingual language models can transfer task-specific supervision from one language to others. In this work, we propose to improve cross-lingual fine-tuning with consistency regularization. Specifically, we use example consistency regularization to penalize the sensitivity of predictions to four types of data augmentation: subword sampling, Gaussian noise, code-switch substitution, and machine translation. In addition, we employ model consistency to regularize the models trained with two augmented versions of the same training set. Experimental results on the XTREME benchmark show that our method significantly improves cross-lingual fine-tuning across various tasks, including text classification, question answering, and sequence labeling.
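To make the idea of example consistency concrete, the following is a minimal sketch of one possible form of the loss for a single classification example. The abstract does not specify the divergence or the weighting, so the symmetric KL term, the `weight` parameter, the toy bilingual `dictionary`, and all function names here are illustrative assumptions, not the authors' implementation. The sketch covers only code-switch substitution and the example-level penalty; model consistency is not shown.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Assumed consistency measure: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def code_switch(tokens, dictionary, ratio, rng):
    """Replace each token that has a dictionary entry with probability `ratio`.

    `dictionary` is a hypothetical word -> list-of-translations mapping.
    """
    out = []
    for tok in tokens:
        if tok in dictionary and rng.random() < ratio:
            out.append(rng.choice(dictionary[tok]))
        else:
            out.append(tok)
    return out


def example_consistency_loss(logits_orig, logits_aug, label, weight=1.0):
    """Task cross-entropy on the original input plus a consistency penalty.

    The penalty grows as the prediction on the augmented input drifts
    away from the prediction on the original input.
    """
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    task_loss = -math.log(p[label])
    return task_loss + weight * symmetric_kl(p, q)


if __name__ == "__main__":
    rng = random.Random(0)
    toy_dictionary = {"cat": ["gato"], "house": ["casa"]}  # hypothetical entries
    print(code_switch(["the", "cat", "sleeps"], toy_dictionary, 0.5, rng))
    # In practice the logits would come from the model run on each input;
    # here they are hard-coded placeholders.
    print(example_consistency_loss([2.0, 0.5, -1.0], [1.5, 0.8, -0.7], label=0))
```

In this sketch the penalty is zero when the two predictions agree exactly, so the loss reduces to ordinary cross-entropy on the original input; any disagreement introduced by the augmentation adds a positive term scaled by `weight`.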