Information Propagation by Composited Labels in Natural Language Processing
This work addresses a foundational issue in NLP by proposing a theoretical framework for labeling, though the contribution appears incremental because it builds on existing labeling concepts.
The paper tackles the problem of modeling information propagation in NLP by defining labels as maps between entity mentions and their broader contexts, which enables the construction of entity graphs and the calculation of information loss as an entropy-based distance.
In natural language processing (NLP), labeling regions of text, such as words, sentences, and paragraphs, is a basic task. In this paper, a label is defined as a map between the mention of an entity in a region of text and the context of that entity in a broader region of text containing the mention. This definition naturally introduces links between entities, induced by the inclusion relation between regions, and the connected entities form a graph representing the information flow defined by the map. It also enables the calculation of information loss through the map using entropy, and the entropy lost is regarded as the distance between two entities along a path on the graph.
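The entropy-based distance described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the following rests on assumptions: each label is modeled as a deterministic map from a finer region to a broader one, the information lost through a map is taken to be the drop in Shannon entropy H(X) - H(f(X)), and the distance along a path is the sum of the losses over its edges. The function names (`entropy`, `push_forward`, `entropy_loss`, `path_distance`) and the toy data are hypothetical and not taken from the paper.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def push_forward(dist, label_map):
    """Distribution of label_map[x] when x is drawn from dist."""
    out = {}
    for x, p in dist.items():
        y = label_map[x]
        out[y] = out.get(y, 0.0) + p
    return out


def entropy_loss(dist, label_map):
    """Assumed loss: entropy before the map minus entropy after it."""
    return entropy(dist) - entropy(push_forward(dist, label_map))


def path_distance(dist, maps):
    """Assumed distance: sum of entropy losses along a chain of maps."""
    total = 0.0
    for label_map in maps:
        total += entropy_loss(dist, label_map)
        dist = push_forward(dist, label_map)
    return total


# Toy example (hypothetical data): word mentions, uniformly distributed.
words = {"cat": 0.25, "dog": 0.25, "car": 0.25, "bus": 0.25}
# Word-level mention -> sentence-level context.
word_to_sentence = {"cat": "animal", "dog": "animal",
                    "car": "vehicle", "bus": "vehicle"}
# Sentence-level mention -> paragraph-level context.
sentence_to_paragraph = {"animal": "thing", "vehicle": "thing"}

print(entropy_loss(words, word_to_sentence))  # 1.0
print(path_distance(words, [word_to_sentence, sentence_to_paragraph]))  # 2.0
```

Under these assumptions the losses along a path telescope, so the path distance equals the entropy at the start of the path minus the entropy at its end. For a deterministic map the loss is never negative, which is what makes it usable as a distance-like quantity in this sketch.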