On the Curious Case of $\ell_2$ norm of Sense Embeddings
This is an incremental finding: it extends a relationship previously known for word embeddings to sense embeddings, which may be useful to researchers in natural language processing.
The paper shows that the $\ell_2$ norm of a static sense embedding encodes information about the frequency of that sense in the training corpus, and demonstrates that this simple feature improves performance on word sense tasks such as Word-in-Context (WiC) and Word Sense Disambiguation (WSD).
We show that the $\ell_2$ norm of a static sense embedding encodes information related to the frequency of that sense in the training corpus used to learn the sense embeddings. This finding can be seen as an extension of a previously known relationship for word embeddings to sense embeddings. Our experimental results show that, in spite of its simplicity, the $\ell_2$ norm of sense embeddings is a surprisingly effective feature for several word-sense-related tasks such as (a) most frequent sense prediction, (b) Word-in-Context (WiC), and (c) Word Sense Disambiguation (WSD). In particular, by simply including the $\ell_2$ norm of a sense embedding as a feature in a classifier, we show that we can improve WiC and WSD methods that use static sense embeddings.
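As an illustration of what "including the $\ell_2$ norm of a sense embedding as a feature in a classifier" could look like, the following is a minimal sketch and not the authors' implementation. The function names (`l2_norm`, `cosine`, `pair_features`), the choice of cosine similarity as the baseline feature, and the two-dimensional toy vectors are all assumptions made for this example; the abstract does not specify the feature set or the classifier.

```python
import math
from typing import List, Sequence


def l2_norm(vec: Sequence[float]) -> float:
    """Return the l2 (Euclidean) norm of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine similarity of u and v (0.0 if either is all zeros)."""
    denom = l2_norm(u) * l2_norm(v)
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / denom


def pair_features(s1: Sequence[float], s2: Sequence[float]) -> List[float]:
    """Hypothetical feature vector for a pair of sense embeddings.

    Assumption: a baseline classifier uses cosine similarity only; here the
    l2 norm of each sense embedding is appended as an extra feature.
    """
    return [cosine(s1, s2), l2_norm(s1), l2_norm(s2)]


# Toy, made-up sense embeddings (not taken from any trained model).
sense_a = [3.0, 4.0]
sense_b = [6.0, 8.0]

features = pair_features(sense_a, sense_b)
print(features)
```

The two toy vectors point in the same direction, so cosine similarity alone cannot tell them apart, while the appended norms differ. That is the kind of extra signal a downstream classifier could use if, as the abstract reports, the norm carries sense frequency information.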