CL · Mar 28, 2024

Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction

arXiv:2403.19283v1 · 30 citations · h-index: 5 · Has Code · NAACL
Originality: Incremental advance
AI Analysis

This work addresses grammatical error correction (GEC) with large language models and represents an incremental advance in example selection for in-context learning.

The paper tackles the challenge of applying large language models to grammatical error correction by proposing a novel in-context example selection strategy based on ungrammatical syntax, which outperforms word-matching or semantics-based methods on benchmark datasets.

In the era of large language models (LLMs), in-context learning (ICL) stands out as an effective prompting strategy that explores LLMs' potency across various tasks. However, applying LLMs to grammatical error correction (GEC) is still a challenging task. In this paper, we propose a novel ungrammatical-syntax-based in-context example selection strategy for GEC. Specifically, we measure similarity of sentences based on their syntactic structures with diverse algorithms, and identify optimal ICL examples sharing the most similar ill-formed syntax to the test input. Additionally, we carry out a two-stage process to further improve the quality of selection results. On benchmark English GEC datasets, empirical results show that our proposed ungrammatical-syntax-based strategies outperform commonly-used word-matching or semantics-based methods with multiple LLMs. This indicates that for a syntax-oriented task like GEC, paying more attention to syntactic information can effectively boost LLMs' performance. Our code will be publicly available after the publication of this paper.
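
The selection procedure described in the abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: the `Example` class, the function names, the per-token syntactic label sequences, the use of `difflib.SequenceMatcher` as the similarity measure, and word overlap as the first stage are all assumptions made for illustration. The abstract does not say which similarity algorithms or which two stages the paper uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Example:
    """One candidate in-context example (hypothetical container)."""
    source: str    # ungrammatical sentence
    target: str    # corrected sentence
    syntax: tuple  # assumed: one syntactic label per token of `source`


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens (stand-in first stage)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def syntax_similarity(a, b) -> float:
    """Similarity in [0, 1] between two label sequences.

    SequenceMatcher is a stand-in; any syntactic similarity measure
    could be plugged in here.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def select_examples(test_source, test_syntax, pool, shortlist_size=4, k=2):
    """Two-stage selection: coarse shortlist, then rerank by syntax."""
    # Stage 1 (assumed): keep the candidates with the highest word overlap.
    shortlist = sorted(
        pool,
        key=lambda ex: token_overlap(test_source, ex.source),
        reverse=True,
    )[:shortlist_size]
    # Stage 2: rerank the shortlist by similarity of the ill-formed syntax.
    reranked = sorted(
        shortlist,
        key=lambda ex: syntax_similarity(test_syntax, ex.syntax),
        reverse=True,
    )
    return reranked[:k]


def build_prompt(test_source, examples):
    """Format the selected examples and the test input as a few-shot prompt."""
    blocks = [f"Input: {ex.source}\nOutput: {ex.target}" for ex in examples]
    blocks.append(f"Input: {test_source}\nOutput:")
    return "\n\n".join(blocks)
```

In practice the label sequences would come from a syntactic parser run on the ungrammatical sentences; here they are supplied by the caller, so the sketch has no external dependencies.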
