CL | Feb 28, 2025

Better Benchmarking LLMs for Zero-Shot Dependency Parsing

Ana Ezquerro, Carlos Gómez-Rodríguez, David Vilares

arXiv:2502.20866v117.012 citations | h-index: 4 | NoDaLiDa/Baltic-HLT

Originality: Synthesis-oriented

AI Analysis

The findings reveal that current open LLMs cannot yet perform accurate zero-shot syntactic parsing, which matters for NLP researchers and practitioners who rely on these models for linguistic tasks.

The paper investigates whether state-of-the-art open-weight LLMs can outperform uninformed baselines in zero-shot dependency parsing. Most of the tested LLMs fail to beat the best baselines; only the newest and largest LLaMA versions do so for most languages, and their performance remains rather low.

While LLMs excel in zero-shot tasks, their performance in linguistic challenges like syntactic parsing has been less scrutinized. This paper studies state-of-the-art open-weight LLMs on the task by comparing them to baselines that do not have access to the input sentence, including baselines that have not been used in this context such as random projective trees or optimal linear arrangements. The results show that most of the tested LLMs cannot outperform the best uninformed baselines, with only the newest and largest versions of LLaMA doing so for most languages, and still achieving rather low performance. Thus, accurate zero-shot syntactic parsing is not forthcoming with open LLMs.
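To make the idea of an uninformed baseline concrete, here is a minimal Python sketch, not the paper's implementation. It builds trees from the sentence length alone and scores them with unlabeled attachment score (UAS). The function names, the recursive sampling scheme (which is not uniform over projective trees), and the toy gold tree are all assumptions made for illustration.

```python
import random


def right_chain_heads(n):
    """Baseline: word 1 attaches to the root (0), every other word to its left neighbour."""
    return [i - 1 for i in range(1, n + 1)]


def random_projective_heads(n, rng):
    """Sample a projective tree over words 1..n without looking at the words.

    Returns a list where entry i-1 is the head of word i (0 = artificial root).
    Assumption: a simple recursive scheme, NOT uniform over projective trees.
    """
    heads = [0] * (n + 1)

    def build(lo, hi, parent):
        if lo > hi:
            return
        h = rng.randint(lo, hi)   # pick the head word of the span [lo, hi]
        heads[h] = parent
        build(lo, h - 1, h)       # words to the left form one subtree under h
        build(h + 1, hi, h)       # words to the right form another

    build(1, n, 0)
    return heads[1:]


def is_projective(heads):
    """True if no two arcs cross (the root sits at position 0)."""
    arcs = [(min(i, h), max(i, h)) for i, h in enumerate(heads, start=1)]
    return not any(a < c < b < d for a, b in arcs for c, d in arcs)


def uas(pred, gold):
    """Unlabeled attachment score: fraction of words with the correct head."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


if __name__ == "__main__":
    gold = [2, 0, 2]  # hypothetical gold heads for a toy 3-word sentence
    rng = random.Random(0)
    print("chain UAS:", uas(right_chain_heads(3), gold))
    print("random projective UAS:", uas(random_projective_heads(3, rng), gold))
```

Neither baseline reads the input sentence, which is what makes it a useful floor: under the framing above, a parser that cannot clearly beat such scores has shown little evidence of using the sentence at all.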
