CLAIIR, Jan 29, 2019

TiFi: Taxonomy Induction for Fictional Domains [Extended version]

arXiv:1901.10263v1, 10 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for structured knowledge in domains poorly covered by Wikipedia, such as fictional universes, but it is incremental in that it adapts existing taxonomy induction methods to a specific domain.

The paper tackles the problem of constructing taxonomies for fictional domains, such as Lord of the Rings or Greek Mythology, using noisy inputs from fan wikis, and shows that TiFi achieves very high precision and outperforms state-of-the-art baselines by a substantial margin.

Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases and highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons or Greek Mythology with very high precision and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
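The three phases named in the abstract can be pictured with a toy sketch. This is not TiFi's implementation: the abstract does not say how each phase decides, so the plural-head-noun test, the head-match edge rule, the `META_WORDS` stoplist, and the hand-written `top_level_map` are all hypothetical stand-ins chosen only to make the data flow concrete.

```python
# Toy three-phase pipeline: category cleaning -> edge cleaning -> top-level construction.
# All heuristics below are illustrative assumptions, not the method from the paper.

# Hypothetical stoplist of wiki-maintenance head words.
META_WORDS = {"images", "files", "templates", "stubs"}


def head(name):
    """Return the last word of a category name, lowercased (a crude head-noun guess)."""
    return name.lower().split()[-1]


def clean_categories(categories):
    """Phase (i): keep categories whose head looks plural and is not a meta word."""
    return {c for c in categories
            if head(c).endswith("s") and head(c) not in META_WORDS}


def clean_edges(edges, classes):
    """Phase (ii): keep (child, parent) edges between surviving classes that share a head."""
    return {(child, parent) for child, parent in edges
            if child in classes and parent in classes
            and child != parent and head(child) == head(parent)}


def attach_top_level(classes, edges, top_level_map):
    """Phase (iii): map parentless classes to a top-level label via a lookup table."""
    has_parent = {child for child, _ in edges}
    return {c: top_level_map[head(c)] for c in classes
            if c not in has_parent and head(c) in top_level_map}


# Made-up input resembling a noisy fan-wiki category system.
categories = ["Kings", "Elven kings", "Swords", "King images", "Gondor"]
edges = [("Elven kings", "Kings"), ("Swords", "Gondor"), ("Kings", "King images")]
# Hand-written stand-in for a mapping onto high-level WordNet categories.
top_level_map = {"kings": "person", "swords": "artifact"}

classes = clean_categories(categories)
kept_edges = clean_edges(edges, classes)
roots = attach_top_level(classes, kept_edges, top_level_map)
```

In this toy run, "King images" and "Gondor" are dropped in the first phase, only the "Elven kings" to "Kings" edge survives the second, and the remaining parentless classes receive a top-level label in the third.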

