CL · Nov 9, 2023

Memorisation Cartography: Mapping out the Memorisation-Generalisation Continuum in Neural Machine Translation

Verna Dankers, Ivan Titov, Dieuwke Hupkes

arXiv:2311.05379v121.3133 citations · h-index: 28

Originality: Incremental advance

AI Analysis

This work addresses data memorization in neural networks, a fundamental issue for machine translation researchers, but it is an incremental advance: it builds on existing metrics to map and analyze data points.

The study examines how memorization and generalization vary across data points in neural machine translation. It finds that surface-level characteristics and training signals predict memorization, and that different subsets of the memorization-generalization map influence model performance.

When a neural network is trained, it will quickly memorise some source-target mappings from the dataset but never learn some others. Yet, memorisation is not easily expressed as a binary feature that is good or bad: individual datapoints lie on a memorisation-generalisation continuum. What determines a datapoint's position on that spectrum, and how does that spectrum influence neural models' performance? We address these two questions for neural machine translation (NMT) models. We use the counterfactual memorisation metric to (1) build a resource that places 5M NMT datapoints on a memorisation-generalisation map, (2) illustrate how the datapoints' surface-level characteristics and a model's per-datum training signals are predictive of memorisation in NMT, and (3) describe the influence that subsets of that map have on NMT systems' performance.
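
The abstract names the counterfactual memorisation metric without spelling it out. As that metric is generally defined, a datapoint's score is the gap between how well models perform on it when it was in their training subset and how well they perform on it when it was held out. The sketch below illustrates that idea only; it is not the authors' implementation. The function name, the data layout (one row of scores per trained model), and the toy numbers are all assumptions made for illustration, and the per-datapoint performance score is assumed to be something like a target likelihood in [0, 1].

```python
from statistics import mean


def counterfactual_memorisation(scores, in_train):
    """Hypothetical helper: per-datapoint counterfactual memorisation.

    scores[m][i]   -- performance of model m on datapoint i (assumed in [0, 1])
    in_train[m][i] -- True if datapoint i was in model m's training subset

    Returns a list with one (seen_score, unseen_score, gap) tuple per datapoint,
    or None when a datapoint was never seen or never held out.
    """
    n_models = len(scores)
    n_points = len(scores[0])
    results = []
    for i in range(n_points):
        seen = [scores[m][i] for m in range(n_models) if in_train[m][i]]
        unseen = [scores[m][i] for m in range(n_models) if not in_train[m][i]]
        if not seen or not unseen:
            # The gap is undefined without both kinds of model.
            results.append(None)
            continue
        seen_score = mean(seen)      # performance when trained on the datapoint
        unseen_score = mean(unseen)  # performance when the datapoint was held out
        results.append((seen_score, unseen_score, seen_score - unseen_score))
    return results


# Toy data (made up): 4 models, 3 datapoints.
toy_scores = [
    [0.9, 0.8, 0.2],
    [0.9, 0.1, 0.1],
    [0.8, 0.9, 0.2],
    [0.9, 0.2, 0.1],
]
toy_in_train = [
    [True, True, True],
    [False, False, True],
    [True, True, False],
    [False, False, False],
]
toy_map = counterfactual_memorisation(toy_scores, toy_in_train)
```

In this toy setup, datapoint 0 scores well whether or not it was seen (generalised), datapoint 1 scores well only when seen (memorised, so it has the largest gap), and datapoint 2 scores poorly either way (never learnt). Plotting the seen score against the unseen score for every datapoint gives the kind of two-dimensional map the abstract describes.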
