Decomposing Generalization: Models of Generic, Habitual, and Episodic Statements
This work addresses a specific problem in computational linguistics for researchers, but it is incremental as it builds on existing methods and datasets.
The authors tackled the problem of modeling linguistic expressions of generalization by developing a semantic framework and dataset, and found that contextual word embeddings like ELMo were most effective for prediction tasks, with specific accuracy improvements noted.
We present a novel semantic framework for modeling linguistic expressions of generalization---generic, habitual, and episodic statements---as combinations of simple, real-valued referential properties of predicates and their arguments. We use this framework to construct a dataset covering the entirety of the Universal Dependencies English Web Treebank. We use this dataset to probe the efficacy of type-level and token-level information---including hand-engineered features and static (GloVe) and contextual (ELMo) word embeddings---for predicting expressions of generalization. Data and code are available at decomp.io.