CL IR LG | Dec 29, 2022

Maximizing Use-Case Specificity through Precision Model Tuning

Pranjali Awasthi, David Recio-Mitter, Yosuke Kyle Sugi

arXiv:2212.14206v1 | 1 citation | h-index: 32

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of optimizing language models for domain-specific information retrieval. The contribution is incremental: it applies existing fine-tuning methods to a new biomedical dataset.

The paper analyzes four transformer-based language models for biomedical information retrieval over a corpus of research papers on protein structure/function prediction. It finds that smaller models (<10B parameters) fine-tuned on domain-specific data outperform larger models by 50% on average in accuracy, relevance, and interpretability on highly specific questions.

Language models have become increasingly popular in recent years for tasks like information retrieval. As use cases become oriented toward specific domains, fine-tuning becomes the default for achieving standard performance. Fine-tuning these models for specific tasks and datasets requires careful tuning of the model's hyperparameters and training techniques. In this paper, we present an in-depth analysis of the performance of four transformer-based language models on the task of biomedical information retrieval. The models we consider are DeepMind's RETRO (7B parameters), GPT-J (6B parameters), GPT-3 (175B parameters), and BLOOM (176B parameters). We compare their performance on the basis of relevance, accuracy, and interpretability, using a large corpus of 480,000 research papers on protein structure/function prediction as our dataset. Our findings suggest that smaller models, with <10B parameters and fine-tuned on domain-specific datasets, tend to outperform larger language models on highly specific questions in terms of accuracy, relevance, and interpretability by a significant margin (+50% on average). However, larger models generally provide better results on broader prompts.
