CL, May 27, 2025

A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

arXiv:2505.21224v1, 1 citations, h-index: 20, ACL
Originality: Incremental advance
AI Analysis

This work provides insights into model robustness for machine translation, addressing reliability issues in NLP systems, though it is incremental as it builds on existing robustness studies.

The paper investigates how neural machine translation models handle grammatical errors by analyzing internal representations and attention mechanisms. It finds that the encoder first detects an error and then corrects it by shifting the representation toward the correct form, and it identifies specific 'Robustness Heads' that attend to interpretable linguistic units.

Understanding robustness is essential for building reliable NLP systems. Unfortunately, in the context of machine translation, previous work mainly focused on documenting robustness failures or improving robustness. In contrast, we study robustness from a model representation perspective by looking at internal model representations of ungrammatical inputs and how they evolve through model layers. For this purpose, we perform Grammatical Error Detection (GED) probing and representational similarity analysis. Our findings indicate that the encoder first detects the grammatical error, then corrects it by moving its representation toward the correct form. To understand what contributes to this process, we turn to the attention mechanism where we identify what we term Robustness Heads. We find that Robustness Heads attend to interpretable linguistic units when responding to grammatical errors, and that when we fine-tune models for robustness, they tend to rely more on Robustness Heads for updating the ungrammatical word representation.
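The representational similarity idea in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the measure and uses synthetic vectors in place of real encoder states; the function name `layerwise_cosine`, the array shapes, and the way the perturbation shrinks with depth are all hypothetical choices made for illustration, not the authors' method or data.

```python
import numpy as np

def layerwise_cosine(noisy, clean):
    """Cosine similarity per layer between two stacks of word representations.

    noisy, clean: arrays of shape (num_layers, hidden_dim), one vector per layer.
    Returns an array of shape (num_layers,).
    """
    noisy = np.asarray(noisy, dtype=float)
    clean = np.asarray(clean, dtype=float)
    dots = np.sum(noisy * clean, axis=1)
    norms = np.linalg.norm(noisy, axis=1) * np.linalg.norm(clean, axis=1)
    return dots / norms

# Synthetic stand-in for encoder states (assumption: 6 layers, 16 dimensions).
rng = np.random.default_rng(0)
num_layers, hidden_dim = 6, 16
clean = rng.normal(size=(num_layers, hidden_dim))
noise = rng.normal(size=(num_layers, hidden_dim))

# The perturbation is scaled down with depth purely by construction,
# to mimic a representation drifting toward the correct form.
weights = np.linspace(1.0, 0.0, num_layers)[:, None]
noisy = clean + weights * noise

sims = layerwise_cosine(noisy, clean)
for layer, sim in enumerate(sims):
    print(f"layer {layer}: cosine similarity = {sim:.3f}")
```

With real models, `noisy` and `clean` would hold the hidden states of the ungrammatical word and its corrected counterpart at each encoder layer; a similarity that rises with depth would be consistent with the correction behaviour the abstract describes.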

