CL Sep 27, 2021

Language Invariant Properties in Natural Language Processing

Federico Bianchi, Debora Nozza, Dirk Hovy

arXiv:2109.13037v230.3640 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for quantitative evaluation of transformation robustness in NLP, with implications for social factors and language pragmatics, though it is incremental in applying existing concepts to new contexts.

The paper addresses how to evaluate the robustness of NLP transformation algorithms by introducing language invariant properties: properties that should remain unchanged under transformations such as translation or paraphrasing. It finds that many transformations alter properties such as author characteristics, for example by making the author sound more male.

Meaning is context-dependent, but many properties of language (should) remain the same even if we transform the context. For example, sentiment, entailment, or speaker properties should be the same in a text and its translation. We introduce language invariant properties, i.e., properties that should not change when we transform text, and show how they can be used to quantitatively evaluate the robustness of transformation algorithms. We use translation and paraphrasing as transformation examples, but our findings apply more broadly to any transformation. Our results indicate that many NLP transformations change properties like author characteristics, i.e., they make authors sound more male. We believe that studying these properties will allow NLP to address both social factors and pragmatic aspects of language. We also release an application suite that can be used to evaluate the invariance of transformation applications.
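The evaluation idea in the abstract can be illustrated with a minimal sketch: classify a property on the original texts, classify it again on the transformed texts, and compare the two sets of predictions. This is not the authors' released suite. The function `invariance_report`, the toy sentiment classifier, and the toy paraphraser are hypothetical names chosen only for illustration, and the two scores (agreement rate and per-label shift) are simple stand-ins for whatever metrics the paper actually reports.

```python
from collections import Counter
from typing import Callable, Dict, Sequence


def invariance_report(
    texts: Sequence[str],
    transform: Callable[[str], str],
    classify: Callable[[str], str],
) -> Dict[str, object]:
    """Compare a property classifier's output before and after a transformation.

    Hypothetical helper: 'agreement' is the fraction of texts whose predicted
    label is unchanged; 'shift' is the change in each label's share of the
    predictions (positive means the label became more frequent).
    """
    if not texts:
        raise ValueError("need at least one text")

    original = [classify(t) for t in texts]
    transformed = [classify(transform(t)) for t in texts]

    n = len(texts)
    agreement = sum(o == t for o, t in zip(original, transformed)) / n

    before, after = Counter(original), Counter(transformed)
    labels = sorted(set(original) | set(transformed))
    shift = {label: (after[label] - before[label]) / n for label in labels}

    return {"agreement": agreement, "shift": shift}


# Toy stand-ins (assumptions, not real models): a keyword sentiment
# classifier and a lossy "paraphraser" that weakens the sentiment word.
def toy_classify(text: str) -> str:
    return "positive" if "great" in text else "negative"


def toy_paraphrase(text: str) -> str:
    return text.replace("great", "fine")


report = invariance_report(
    ["great food", "bad service", "great view", "slow wifi"],
    toy_paraphrase,
    toy_classify,
)
print(report)
```

In this toy run, half of the predicted labels change and the whole shift goes toward one label. That directional pattern is the kind of systematic drift the abstract describes for author characteristics. In practice, `classify` would be a trained property classifier and `transform` a translation or paraphrasing system.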
