CL · Jul 19, 2023

Improving the Reusability of Pre-trained Language Models in Real-world Applications

Somayeh Ghanbarzadeh, Hamid Palangi, Yan Huang, Radames Cruz Moreno, Hamed Khanpour

arXiv:2307.10457v3 · 0.5 · h-index: 44

Originality: Incremental advance

AI Analysis

This paper addresses the limited reusability of PLMs in real-world applications where data distributions vary, though the contribution appears incremental because it builds on existing fine-tuning methods.

The paper tackles the generalization problem of Pre-trained Language Models (PLMs) on Out-of-Distribution (OOD) data by proposing Mask-tuning, which integrates Masked Language Modeling into fine-tuning. Results show it surpasses state-of-the-art techniques, improving performance on both OOD and in-distribution datasets.

The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but not for general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs' generalization on OOD datasets while improving their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the reusability of PLMs on unseen data, making them more practical and effective for real-world applications.
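The abstract says only that Mask-tuning integrates MLM training objectives into fine-tuning; it does not give the masking scheme or how the losses are combined. The following is a minimal sketch of one way such an integration could look, not the authors' implementation. The mask token string, the 15% masking rate, the weighted-sum combination, and the names `mask_tokens`, `joint_loss`, and `alpha` are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # assumed mask token, as used by BERT-style tokenizers


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with MASK (hypothetical helper).

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere, so an MLM loss would be computed
    only where a token was hidden.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def joint_loss(mlm_loss, task_loss, alpha=0.5):
    """Combine the two objectives as a weighted sum.

    alpha is an assumed mixing weight: alpha=1 trains on MLM only,
    alpha=0 reduces to ordinary fine-tuning.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * mlm_loss + (1.0 - alpha) * task_loss


if __name__ == "__main__":
    sentence = "the movie was surprisingly good".split()
    masked, labels = mask_tokens(sentence, rng=random.Random(0))
    print(masked, labels)
    # In a real training step the two losses would come from the model's
    # MLM head and classification head; fixed numbers stand in for them here.
    print(joint_loss(mlm_loss=2.0, task_loss=4.0, alpha=0.25))
```

In a real training loop the two loss values would be produced by the model on each batch and the combined value backpropagated; the mixing weight would be a hyperparameter to tune.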
