Exploring sentence informativeness
This is an incremental exploration aimed at NLP researchers who want to improve word embeddings trained on scarce data.
The study addresses how to define and measure sentence informativeness for word representation. It finds that the proposed classifiers and the manual annotation capture different notions of informativeness, but that training word embeddings with the classifiers' predictions improves embedding quality.
This study is a preliminary exploration of the concept of informativeness (how much information a sentence gives about a word it contains) and its potential benefits for building quality word representations from scarce data. We propose several sentence-level classifiers to predict informativeness, and we manually annotate a set of sentences. We conclude that these two measures correspond to different notions of informativeness. However, our experiments show that using the classifiers' predictions to train word embeddings has an impact on embedding quality.
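As an illustration only, the following sketch shows one plausible way a sentence-level informativeness score could be used to select training sentences for a target word. The abstract does not specify how the classifiers' predictions enter embedding training, so the filtering step, the threshold, and the names toy_informativeness and select_informative are all assumptions. The scorer is a toy heuristic standing in for a trained classifier, not one of the classifiers proposed in the study.

```python
from typing import Callable, List, Tuple

# Tiny illustrative stopword list (an assumption, not from the study).
STOPWORDS = {"a", "an", "the", "is", "it", "of", "to", "and", "in", "this", "that"}


def toy_informativeness(sentence: str, target: str) -> float:
    """Hypothetical stand-in for a trained classifier.

    Scores a sentence by the share of its context tokens (all tokens except
    the target word) that are not stopwords. Returns 0.0 if the target word
    is absent or has no context.
    """
    tokens = sentence.lower().split()
    if target not in tokens:
        return 0.0
    context = [t for t in tokens if t != target]
    if not context:
        return 0.0
    content = [t for t in context if t not in STOPWORDS]
    return len(content) / len(context)


def select_informative(
    sentences: List[str],
    target: str,
    scorer: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Keep the sentences whose score for the target word meets the threshold."""
    scored = [(s, scorer(s, target)) for s in sentences]
    return [(s, score) for s, score in scored if score >= threshold]


if __name__ == "__main__":
    corpus = [
        "the okapi is a forest mammal related to giraffes",
        "this is an okapi",
    ]
    for sentence, score in select_informative(corpus, "okapi", toy_informativeness, 0.4):
        print(f"{score:.2f}  {sentence}")
```

The retained sentences would then be passed to an embedding trainer of choice. Using the scores as per-sentence weights rather than as a hard filter is an equally plausible reading of the abstract.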