CL · Sep 10, 2021

Studying word order through iterative shuffling

Nikolay Malkin, Sameera Lanka, Pranav Goel, Nebojsa Jojic

arXiv:2109.04867v130.8661 citations · Has Code

Originality: Incremental advance

AI Analysis

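The paper challenges the assumption that strong benchmark performance by neural language models reflects an understanding of syntax, with implications for language modeling and constrained generation, though it is incremental in scope.

The paper refutes, in many cases, the hypothesis that word order encodes meaning essential to NLP benchmark tasks: in GLUE and in various genres of English text, the words of a sentence or phrase can rarely be permuted into a phrase carrying substantially different information. The result relies on inference by iterative shuffling (IBIS), a novel procedure that searches for the highest-likelihood ordering of a bag of words under a fixed language model.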

As neural language models approach human performance on NLP benchmark tasks, their advances are widely seen as evidence of an increasingly complex understanding of syntax. This view rests upon a hypothesis that has not yet been empirically tested: that word order encodes meaning essential to performing these tasks. We refute this hypothesis in many cases: in the GLUE suite and in various genres of English text, the words in a sentence or phrase can rarely be permuted to form a phrase carrying substantially different information. Our surprising result relies on inference by iterative shuffling (IBIS), a novel, efficient procedure that finds the ordering of a bag of words having the highest likelihood under a fixed language model. IBIS can use any black-box model without additional training and is superior to existing word ordering algorithms. Coalescing our findings, we discuss how shuffling inference procedures such as IBIS can benefit language modeling and constrained generation.
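
To make the word-ordering problem concrete, here is a minimal Python sketch of searching for a high-scoring ordering of a bag of words under a black-box scorer. It is not the authors' IBIS procedure, whose details the abstract does not give; it is a plain hill-climbing search that cuts the current ordering into segments and tries rearranging them. The names `shuffle_search` and `toy_score`, and the toy bigram table standing in for a language model, are assumptions made for illustration only.

```python
import itertools
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a language model: bigrams seen in one toy sentence.
SEEN_BIGRAMS = {
    ("<s>", "the"), ("the", "dog"), ("dog", "chased"),
    ("chased", "the"), ("the", "cat"), ("cat", "</s>"),
}

def toy_score(words: Sequence[str]) -> float:
    """Toy log-likelihood: 0.0 for each seen bigram, -5.0 for each unseen one."""
    padded = ["<s>", *words, "</s>"]
    return sum(0.0 if pair in SEEN_BIGRAMS else -5.0
               for pair in zip(padded, padded[1:]))

def shuffle_search(
    words: Sequence[str],
    score: Callable[[Sequence[str]], float],
    n_iters: int = 200,
    seed: int = 0,
) -> Tuple[List[str], float]:
    """Hill-climb over orderings of `words`, treating `score` as a black box."""
    rng = random.Random(seed)
    best = list(words)
    best_score = score(best)
    if len(best) < 2:
        return best, best_score  # nothing to reorder
    for _ in range(n_iters):
        # Cut the current ordering at two distinct random positions.
        i, j = sorted(rng.sample(range(len(best) + 1), 2))
        segments = [best[:i], best[i:j], best[j:]]
        # Try every arrangement of the three segments; keep strict improvements.
        for arrangement in itertools.permutations(segments):
            candidate = [w for segment in arrangement for w in segment]
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
    return best, best_score

bag = ["cat", "the", "chased", "dog", "the"]
ordering, ordering_score = shuffle_search(bag, toy_score)
print(" ".join(ordering), ordering_score)
```

Because the scorer is passed in as a function, the same loop could in principle wrap any model that assigns a likelihood to a word sequence. A search of this kind only ever accepts improvements, so it can stall in a local optimum and offers no guarantee of finding the best ordering.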
