One Representation per Word - Does it make Sense for Composition?
This paper addresses a question relevant to researchers and practitioners who build phrase representations from word vectors: must word senses be disambiguated before composition, or can composition itself resolve the intended sense?
The paper investigates whether word sense disambiguation is necessary before composition. It finds that single-sense vector models perform as well as or better than multi-sense models on a phrase similarity task and a word-sense discrimination task, and that simple composition functions effectively recover sense-specific information.
In this paper, we investigate whether an a priori disambiguation of word senses is strictly necessary or whether the meaning of a word in context can be disambiguated through composition alone. We evaluate the performance of off-the-shelf single-vector and multi-sense vector models on a benchmark phrase similarity task and a novel task for word-sense discrimination. We find that single-sense vector models perform as well as or better than multi-sense vector models despite arguably less clean elementary representations. Our findings furthermore show that simple composition functions such as pointwise addition are able to recover sense-specific information from a single-sense vector model remarkably well.
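As a rough illustration of the idea that pointwise addition can pull a sense out of a single-sense vector, the sketch below composes two word vectors by element-wise addition and compares cosine similarities. The 4-dimensional vectors, the word choices, and the helper names `compose_add` and `cosine` are hypothetical and invented for this example; they are not data, models, or code from the paper.

```python
import math

def compose_add(*vectors):
    """Compose word vectors by pointwise (element-wise) addition."""
    return [sum(components) for components in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors (assumption: dimension 0 loosely tracks "finance",
# dimension 1 loosely tracks "water"). The single vector for "bank" mixes both senses.
toy = {
    "bank":  [0.6, 0.6, 0.1, 0.1],
    "river": [0.0, 0.9, 0.1, 0.0],
    "money": [0.9, 0.0, 0.0, 0.1],
    "shore": [0.1, 0.8, 0.2, 0.0],
}

# Compose the phrase "river bank" from its single-sense word vectors.
river_bank = compose_add(toy["river"], toy["bank"])

# Compare the ambiguous word and the composed phrase against sense-specific probes.
sim_bank_shore = cosine(toy["bank"], toy["shore"])
sim_phrase_shore = cosine(river_bank, toy["shore"])
sim_bank_money = cosine(toy["bank"], toy["money"])
sim_phrase_money = cosine(river_bank, toy["money"])

print("bank vs shore:      ", round(sim_bank_shore, 3))
print("river bank vs shore:", round(sim_phrase_shore, 3))
print("bank vs money:      ", round(sim_bank_money, 3))
print("river bank vs money:", round(sim_phrase_money, 3))
```

With these toy values, the composed phrase moves closer to the water-related probe and away from the finance-related one, compared with the bare vector for "bank". Real embeddings are far higher-dimensional and noisier, so this only conveys the intuition, not the size of the effect reported in the paper.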