Data Augmentation for Personal Knowledge Base Population
This work addresses the incremental improvement of knowledge base population for personal applications, focusing on enhancing completeness, fairness, and diversity while handling data protection and privacy issues.
The paper tackles cold start knowledge base population (KBP) from unstructured documents, where end-to-end F1 scores remain low. The problem is more acute for personal knowledge bases, which raise additional data protection, fairness, and privacy challenges. The authors present a system that combines rule-based annotators with a graph neural network for missing link prediction to populate a more complete, fair, and diverse knowledge base from the TACRED dataset.
Cold start knowledge base population (KBP) is the problem of populating a knowledge base from unstructured documents. While artificial neural networks have led to significant improvements in the individual tasks that make up KBP, the overall F1 of the end-to-end system remains quite low. This problem is more acute in personal knowledge bases, which present additional challenges with regard to data protection, fairness, and privacy. In this work, we present a system that uses rule-based annotators and a graph neural network for missing link prediction to populate a more complete, fair, and diverse knowledge base from the TACRED dataset.
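
As a rough illustration of the two stages named above, the following minimal sketch pairs a rule-based annotator with a missing-link scorer. It is not the authors' system. The regex patterns, the example text, and the function names `annotate`, `build_graph`, and `candidate_links` are all hypothetical, and the relation labels are used only as TACRED-style examples. A simple common-neighbors heuristic stands in for the graph neural network, which a real system would train on the extracted graph.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical rules: each maps a surface template to a TACRED-style relation label.
RULES = [
    (re.compile(r"(?P<subj>[A-Z][a-z]+) works at (?P<obj>[A-Z][a-z]+)"), "per:employee_of"),
    (re.compile(r"(?P<subj>[A-Z][a-z]+) is married to (?P<obj>[A-Z][a-z]+)"), "per:spouse"),
    (re.compile(r"(?P<subj>[A-Z][a-z]+) lives in (?P<obj>[A-Z][a-z]+)"), "per:cities_of_residence"),
]


def annotate(text):
    """Return (subject, relation, object) triples matched by the rules."""
    triples = []
    for pattern, relation in RULES:
        for match in pattern.finditer(text):
            triples.append((match.group("subj"), relation, match.group("obj")))
    return triples


def build_graph(triples):
    """Build undirected adjacency sets over entities (relation labels are ignored here)."""
    neighbors = defaultdict(set)
    for subj, _relation, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
    return neighbors


def candidate_links(neighbors, min_shared=1):
    """Score currently unlinked entity pairs by their number of shared neighbors.

    This heuristic is a stand-in for a learned link predictor such as a GNN.
    """
    scored = []
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already linked, so not a missing link
        shared = len(neighbors[a] & neighbors[b])
        if shared >= min_shared:
            scored.append((a, b, shared))
    # Highest score first, then alphabetical for a stable order.
    return sorted(scored, key=lambda item: (-item[2], item[0], item[1]))


text = (
    "Alice works at Acme. Bob works at Acme. Alice lives in Paris. "
    "Bob lives in Paris. Carol works at Acme."
)
triples = annotate(text)
graph = build_graph(triples)
ranking = candidate_links(graph)

for a, b, score in ranking:
    print(f"{a} -- {b}: {score} shared neighbor(s)")
```

The scorer only proposes that a link may be missing between two entities. Deciding which relation type the link carries, and whether adding it is appropriate for a personal knowledge base, is left to the learned model and to the privacy and fairness constraints the abstract describes.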