Important Attribute Identification in Knowledge Graph
This addresses the challenge of determining attribute importance in knowledge graphs for applications like information retrieval and natural language generation, representing an incremental improvement with a novel method for a known bottleneck.
The paper tackles the problem of identifying important attributes in knowledge graphs where entities have many attributes, proposing a method that uses external user-generated text data with word/sub-word embeddings to match and rank attributes by matching cohesiveness, achieving better performance than previous traditional methods.
The knowledge graph(KG) composed of entities with their descriptions and attributes, and relationship between entities, is finding more and more application scenarios in various natural language processing tasks. In a typical knowledge graph like Wikidata, entities usually have a large number of attributes, but it is difficult to know which ones are important. The importance of attributes can be a valuable piece of information in various applications spanning from information retrieval to natural language generation. In this paper, we propose a general method of using external user generated text data to evaluate the relative importance of an entity's attributes. To be more specific, we use the word/sub-word embedding techniques to match the external textual data back to entities' attribute name and values and rank the attributes by their matching cohesiveness. To our best knowledge, this is the first work of applying vector based semantic matching to important attribute identification, and our method outperforms the previous traditional methods. We also apply the outcome of the detected important attributes to a language generation task; compared with previous generated text, the new method generates much more customized and informative messages.