CL | Jun 13, 2024

ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models

arXiv:2406.09334v3 | 18 citations
Originality: Incremental advance
AI Analysis

ProxyLM addresses the computational burden that researchers and practitioners face when evaluating language models across diverse tasks and languages. The advance is incremental because it builds on existing performance prediction methods.

The paper tackles the problem of predicting language model performance on multilingual tasks to reduce computational costs, achieving up to a 37.08x speedup and outperforming state-of-the-art methods by at least 1.78x in RMSE.

Performance prediction is a method to estimate the performance of Language Models (LMs) on various Natural Language Processing (NLP) tasks, mitigating computational costs associated with model capacity and data for fine-tuning. Our paper presents ProxyLM, a scalable task- and language-agnostic framework designed to predict the performance of LMs using proxy models. These proxy models act as surrogates, approximating the performance of the LM of interest. By leveraging these proxy models, ProxyLM significantly reduces computational overhead in task evaluations, achieving up to a 37.08x speedup over traditional methods, even with our smallest proxy models. Our results across multiple multilingual NLP tasks and various robustness tests demonstrate that ProxyLM not only adapts well to previously unseen languages in pre-trained LMs, but also generalizes effectively across different datasets, outperforming the state-of-the-art by at least 1.78x in terms of root-mean-square error (RMSE).
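The idea in the abstract can be illustrated with a minimal sketch: fit a regressor that maps a cheap proxy model's score to the score of the LM of interest, then measure prediction quality with RMSE. Everything below is an assumption made for illustration. The scores are made up (they are not results from the paper), the helper names `fit_line` and `rmse` are hypothetical, and a one-feature least-squares line stands in for whatever regressor and feature set ProxyLM actually uses, which this summary does not specify.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Assumption: a single proxy score is the only feature. This is a
    stand-in, not the regressor used in the paper.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("proxy scores must not all be identical")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


# Made-up scores for illustration only (hypothetical, not from the paper).
# Each pair is (proxy model score, target LM score) on one task/language.
proxy_train = [10.0, 14.0, 18.0, 22.0]
target_train = [21.0, 29.0, 37.0, 45.0]
proxy_test = [12.0, 20.0]
target_test = [26.0, 40.0]

# Fit on settings where the expensive target LM was actually evaluated,
# then predict its score elsewhere from the cheap proxy score alone.
slope, intercept = fit_line(proxy_train, target_train)
predictions = [slope * x + intercept for x in proxy_test]
error = rmse(predictions, target_test)
print(f"slope={slope:.2f} intercept={intercept:.2f} RMSE={error:.2f}")
```

With these made-up numbers the fit recovers a slope of 2 and an intercept of 1, and the held-out RMSE is 1.0. The claimed saving comes from the fact that only the proxy model has to be run on new tasks or languages. The target LM's score is predicted there instead of measured.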

Code Implementations: 1 repo
