Knowledge Completion for Generics using Guided Tensor Factorization
This work addresses the challenge of completing generics KBs, which are crucial for applications such as question answering but are more incomplete than named entity KBs and violate the commonly used locally closed world assumption (LCWA), making it a domain-specific advance.
The paper tackles the problem of inferring additional facts for generics knowledge bases (KBs), which involve quantification and complex regularities. Its knowledge-guided tensor factorization approach achieves state-of-the-art results, doubling the size of two science KBs at 74%-86% precision, and its taxonomy-guided active learning method is 6x more effective than active learning baselines at inferring new facts about rare entities.
Given a knowledge base (KB) containing (noisy) facts about common nouns or generics, such as "all trees produce oxygen" or "some animals live in forests", we consider the problem of inferring additional such facts at a precision similar to that of the starting KB. Such KBs capture general knowledge about the world and are crucial for applications such as question answering. Unlike commonly studied named entity KBs such as Freebase, generics KBs involve quantification, have more complex underlying regularities, tend to be more incomplete, and violate the commonly used locally closed world assumption (LCWA). We show that existing KB completion methods struggle with this new task, and present the first approach that is successful. Our results demonstrate that external information, such as relation schemas and entity taxonomies, can be a surprisingly powerful tool in this setting if used appropriately. First, our simple yet effective knowledge-guided tensor factorization approach achieves state-of-the-art results on two generics KBs (80% precise) for science, doubling their size at 74%-86% precision. Second, our novel taxonomy-guided, submodular active learning method for collecting annotations about rare entities (e.g., oriole, a bird) is 6x more effective at inferring further new facts about them than multiple active learning baselines.
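To make the idea of submodular selection concrete, the following is a minimal sketch of greedy maximization of a coverage objective, which is one standard way to optimize a monotone submodular function under a budget. The abstract does not specify the objective actually used, so everything here is an assumption for illustration: the function name `greedy_select`, the coverage objective, and the toy mapping from each candidate entity to the set of taxonomy neighbors an annotation about it might inform.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], budget: int) -> List[str]:
    """Greedily pick up to `budget` candidates that maximize set coverage.

    `candidates` maps a candidate (hypothetically, an entity to annotate)
    to the set of items it covers (hypothetically, taxonomy neighbors that
    could benefit from the annotation). Coverage is monotone submodular,
    so the greedy rule gives the usual (1 - 1/e) approximation guarantee.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(candidates)
    for _ in range(budget):
        # Choose the candidate with the largest marginal gain; iterating
        # over sorted keys makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda c: len(remaining[c] - covered),
            default=None,
        )
        # Stop early when nothing is left or no candidate adds coverage.
        if best is None or not (remaining[best] - covered):
            break
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Toy, made-up taxonomy neighborhoods (not data from the paper).
toy_candidates = {
    "oriole": {"oriole", "sparrow", "robin"},
    "sparrow": {"sparrow", "robin"},
    "oak": {"oak", "maple"},
    "maple": {"maple"},
}
selected = greedy_select(toy_candidates, budget=2)
print(selected)  # ['oriole', 'oak']
```

In this toy run the second pick is `oak` rather than `sparrow`, because everything `sparrow` covers is already covered by `oriole`. That diminishing marginal gain is the property a submodular objective captures.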