Noun2Verb: Probabilistic frame semantics for word class conversion
This work addresses the gap in lexical creativity between NLP systems and humans, enabling more human-like flexibility in word usage. The contribution is incremental: it formalizes prior proposals that listeners interpret novel usages through knowledge shared with the speaker.
The paper tackles the problem of interpreting and generating novel denominal verb usages (e.g., 'to Google') in natural language processing by proposing a probabilistic frame semantics framework called Noun2Verb. It shows that this model explains empirical denominal verb usages in contemporary English, contemporary Mandarin Chinese, and historical English better than state-of-the-art language models.
Humans can flexibly extend word usages across different grammatical classes, a phenomenon known as word class conversion. Noun-to-verb conversion, or denominal verb (e.g., to Google a cheap flight), is one of the most prevalent forms of word class conversion. However, existing natural language processing systems are impoverished in interpreting and generating novel denominal verb usages. Previous work has suggested that novel denominal verb usages are comprehensible if the listener can compute the intended meaning based on shared knowledge with the speaker. Here we explore a computational formalism for this proposal couched in frame semantics. We present a formal framework, Noun2Verb, that simulates the production and comprehension of novel denominal verb usages by modeling shared knowledge of speaker and listener in semantic frames. We evaluate an incremental set of probabilistic models that learn to interpret and generate novel denominal verb usages via paraphrasing. We show that a model where the speaker and listener cooperatively learn the joint distribution over semantic frame elements better explains the empirical denominal verb usages than state-of-the-art language models, evaluated against data from 1) contemporary English in both adult and child speech, 2) contemporary Mandarin Chinese, and 3) the historical development of English. Our work grounds word class conversion in probabilistic frame semantics and bridges the gap between natural language processing systems and humans in lexical creativity.
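To make the idea of interpretation via paraphrasing concrete, the following is a minimal sketch, not the paper's model. It assumes a toy joint distribution over a noun, a semantic relation, and a paraphrase verb, and lets a listener interpret a denominal verb by conditioning that joint on the observed noun. The counts, the relation labels, and the function names `interpret` and `best_paraphrase` are all hypothetical and chosen only for illustration; the models in the paper learn such distributions rather than reading them from a fixed table.

```python
from collections import Counter

# Hypothetical toy counts of (noun, relation, paraphrase verb) triples.
# These numbers are invented for illustration; they are not data from the paper.
COUNTS = Counter({
    ("google", "instrument", "search with"): 8,
    ("google", "location", "look on"): 2,
    ("butter", "locatum", "spread"): 9,
    ("butter", "instrument", "cook with"): 1,
})


def interpret(noun):
    """Return p(relation, verb | noun) by normalizing the joint counts."""
    matches = {(r, v): c for (n, r, v), c in COUNTS.items() if n == noun}
    total = sum(matches.values())
    if total == 0:
        return {}  # noun never observed: no interpretation available
    return {key: c / total for key, c in matches.items()}


def best_paraphrase(noun):
    """Pick the most probable (relation, verb) reading for a denominal verb."""
    posterior = interpret(noun)
    if not posterior:
        return None
    return max(posterior, key=posterior.get)


if __name__ == "__main__":
    # Interpreting "to google": which frame elements best paraphrase it?
    print(best_paraphrase("google"))
    print(interpret("butter"))
```

In this sketch the speaker and listener share a single table, which is the simplest possible stand-in for the shared knowledge the abstract refers to. A fuller treatment would learn the joint distribution from usage data and would also model generation, that is, choosing a noun to use as a verb given an intended relation and paraphrase.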