Towards a Seamless Integration of Word Senses into Downstream NLP Applications
This paper addresses the understudied integration of sense-level information into downstream NLP applications such as topic categorization and polarity detection, though the contribution appears incremental because it builds on existing models.
The paper tackles the problem of lexical ambiguity in NLP systems by integrating sense-level information through a novel disambiguation algorithm, showing consistent performance improvements on topic categorization and polarity detection datasets, particularly with reduced sense granularity and larger documents.
Lexical ambiguity can prevent NLP systems from accurately understanding semantics. Despite its potential benefits, the integration of sense-level information into NLP systems has remained understudied. By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. We show that a simple disambiguation of the input text can lead to consistent performance improvements on multiple topic categorization and polarity detection datasets, particularly when the fine granularity of the underlying sense inventory is reduced and the document is sufficiently large. Our results also point to the need for sense representation research to focus more on in vivo evaluations, which target performance in downstream NLP applications, rather than on artificial benchmarks.
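To make the idea of the pipeline concrete, the following is a minimal sketch of the preprocessing step only: ambiguous words in the input are replaced by sense identifiers before the text would be passed to a classifier. It does not reproduce the paper's disambiguation algorithm; it substitutes a simplified Lesk-style context overlap. The inventory `TOY_INVENTORY`, the coarse mapping `COARSE`, the sense labels, and the function `disambiguate` are all hypothetical names introduced for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy sense inventory: word -> {sense id: signature words}.
# A real system would draw on a lexical resource instead.
TOY_INVENTORY: Dict[str, Dict[str, Set[str]]] = {
    "bank": {
        "bank%finance": {"money", "loan", "deposit", "account"},
        "bank%river": {"river", "water", "shore", "fishing"},
    },
    "bass": {
        "bass%fish": {"fishing", "river", "caught", "water"},
        "bass%music": {"guitar", "band", "play", "sound"},
    },
}

# Hypothetical mapping from fine-grained senses to coarser groups,
# illustrating what "reducing sense granularity" could look like.
COARSE: Dict[str, str] = {
    "bank%finance": "INSTITUTION",
    "bank%river": "LANDFORM",
    "bass%fish": "ANIMAL",
    "bass%music": "INSTRUMENT",
}


def disambiguate(tokens: List[str], coarse: bool = False) -> List[str]:
    """Replace each ambiguous token with a sense id (or its coarse group).

    Simplified Lesk-style heuristic (an assumption, not the paper's method):
    choose the sense whose signature shares the most words with the document.
    """
    context = set(tokens)
    output: List[str] = []
    for token in tokens:
        senses = TOY_INVENTORY.get(token)
        if senses is None:
            # Word not in the inventory: keep the surface form.
            output.append(token)
            continue
        # sorted() makes tie-breaking deterministic (alphabetical sense id).
        best = max(sorted(senses), key=lambda s: len(senses[s] & context))
        output.append(COARSE[best] if coarse else best)
    return output


if __name__ == "__main__":
    doc = "we caught a bass near the river bank".split()
    print(disambiguate(doc))
    print(disambiguate(doc, coarse=True))
```

The resulting token sequence, rather than the raw words, would then be the input to the classification model. Because the overlap is computed against the whole document, longer documents offer more context words to match, which is at least consistent with the observation that disambiguation helps more when the document is sufficiently large.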