Exploring Semantic Capacity of Terms
This work introduces semantic capacity, a measure of how broad a term's meaning is, which could aid downstream tasks in natural language processing. However, the contribution appears incremental because it builds on existing co-occurrence methods.
The authors tackle the problem of quantifying the semantic capacity of terms. They propose a two-step model that evaluates semantic capacity from a large text corpus, and they demonstrate its effectiveness in three fields through comparisons with baselines and human evaluations.
We introduce and study the semantic capacity of terms. For example, the semantic capacity of artificial intelligence is higher than that of linear regression, since artificial intelligence has a broader scope of meaning. Understanding the semantic capacity of terms will help many downstream tasks in natural language processing. For this purpose, we propose a two-step model to investigate the semantic capacity of terms. The model takes a large text corpus as input and can evaluate the semantic capacity of terms, provided that the corpus supplies enough co-occurrence information about them. Extensive experiments in three fields demonstrate the effectiveness and rationality of our model compared with well-designed baselines and human-level evaluations.
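To make "co-occurrence information of terms" concrete, the following is a minimal sketch of how such information might be collected from a corpus. It is not the two-step model described above, whose details are not given here. The function names `cooccurrence_counts` and `distinct_partners`, the sentence-level window, and the substring matching are all assumptions made for illustration. The count of distinct co-occurring terms is only a crude stand-in for breadth of meaning, not the paper's measure.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, terms):
    """Count how often each unordered pair of terms appears in the same sentence.

    Assumption: a sentence is the co-occurrence window, and a term is
    "present" if it occurs as a lowercase substring of the sentence.
    """
    counts = Counter()
    for sentence in sentences:
        text = sentence.lower()
        # Sort so that each unordered pair always maps to the same key.
        present = sorted(t for t in terms if t in text)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


def distinct_partners(counts, terms):
    """Return, for each term, the number of distinct terms it co-occurs with.

    Illustrative proxy only: a term with many partners may have a broad
    scope, but this is not the measure proposed in the paper.
    """
    partners = {t: set() for t in terms}
    for a, b in counts:
        partners[a].add(b)
        partners[b].add(a)
    return {t: len(p) for t, p in partners.items()}


if __name__ == "__main__":
    # Toy corpus and term list, invented for this example.
    sentences = [
        "Artificial intelligence includes machine learning.",
        "Linear regression is a machine learning method.",
        "Artificial intelligence also covers computer vision.",
    ]
    terms = [
        "artificial intelligence",
        "machine learning",
        "linear regression",
        "computer vision",
    ]
    counts = cooccurrence_counts(sentences, terms)
    print(distinct_partners(counts, terms))
```

With a corpus this small the counts say very little, which mirrors the condition stated in the abstract: the corpus must provide enough co-occurrence information before any estimate of semantic capacity is meaningful.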