Optimizing Language Models for Grammatical Acceptability: A Comparative Study of Fine-Tuning Techniques
This work addresses the computational barriers that limit access to large language models, but its contribution is incremental: it compares existing fine-tuning techniques on a standard dataset.
This study tackled the problem of fine-tuning language models for grammatical acceptability tasks, finding that while Vanilla-Fine-Tuning achieved the highest accuracy at 81.2%, LoRA improved computational efficiency by reducing memory usage and iteration time by over 50%.
This study explores the fine-tuning (FT) of the Open Pre-trained Transformer (OPT-125M) for grammatical acceptability tasks using the CoLA dataset. By comparing Vanilla-Fine-Tuning (VFT), Pattern-Based-Fine-Tuning (PBFT), and Parameter-Efficient Fine-Tuning (PEFT) techniques such as Low-Rank Adaptation (LoRA), we demonstrate significant improvements in computational efficiency while maintaining high accuracy. Our experiments reveal that while VFT achieves the highest accuracy (81.2%), LoRA improves FT by reducing memory usage and iteration time by more than 50%, and it increases accuracy in the PBFT case. Context Distillation (CD), though computationally efficient, underperformed, with accuracy around 31%. Our findings contribute to democratizing access to large language models (LLMs) by reducing computational barriers.
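To make the efficiency argument concrete, the following is a minimal sketch of the low-rank update behind LoRA, written in plain Python without any deep-learning library. It is not the study's implementation: the function names, the rank of 8, the scaling factor `alpha`, and the 768 × 768 projection size (assumed here to match the OPT-125M hidden size) are illustrative assumptions. The idea shown is that the pre-trained weight W stays frozen while only two small matrices A and B are trained, so the layer computes W x + (alpha / r) B (A x).

```python
def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for full fine-tuning vs. a LoRA adapter on one weight matrix."""
    full = d_out * d_in            # every entry of W is updated
    lora = rank * (d_out + d_in)   # only A (rank x d_in) and B (d_out x rank) are updated
    return full, lora


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, alpha):
    """Compute W x + (alpha / r) * B (A x), where r is the number of rows of A.

    W is the frozen pre-trained weight; A and B are the trainable low-rank factors.
    """
    rank = len(A)
    scale = alpha / rank
    base = matvec(W, x)               # frozen path
    delta = matvec(B, matvec(A, x))   # low-rank path
    return [b + scale * d for b, d in zip(base, delta)]


if __name__ == "__main__":
    # Assumed sizes: one 768 x 768 projection and a hypothetical rank of 8.
    full, lora = lora_param_counts(768, 768, 8)
    print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4f}")
```

Under these assumed sizes, one projection has 589,824 trainable parameters with full fine-tuning and 12,288 with the adapter. Initializing B to zeros makes the adapted layer start out identical to the frozen one, which is the usual LoRA initialization. The memory and iteration-time savings reported above depend on the actual training setup and are not reproduced by this sketch.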