AI | IR | Oct 17, 2024

Disjointness Violations in Wikidata

arXiv:2410.13707v2 | 2 citations | h-index: 57 | KGSWC
Originality: Synthesis-oriented
AI Analysis

This work addresses data quality issues in Wikidata. The contribution is incremental: it builds on existing constraint-checking methods for knowledge bases.

The paper tackles the problem of disjointness violations in Wikidata, a large community-managed knowledge base, by analyzing modeling patterns, identifying causes, and providing formulas to detect and fix conflicting information. It does not report concrete numerical results.

Disjointness checks are among the most important constraint checks in a knowledge base and can be used to help detect and correct incorrect statements and internal contradictions. Wikidata is a very large, community-managed knowledge base. Because of both its size and construction, Wikidata contains many incorrect statements and internal contradictions. We analyze the current modeling of disjointness on Wikidata, identify patterns that cause these disjointness violations and categorize them. We use SPARQL queries to identify each "culprit" causing a disjointness violation and lay out formulas to identify and fix conflicting information. We finally discuss how disjointness information could be better modeled and expanded in Wikidata in the future.
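To make the idea of a disjointness check concrete, here is a minimal sketch in Python over a toy in-memory class hierarchy. It is not the authors' method and does not use their SPARQL queries. The class names, the `SUBCLASS_OF` and `DISJOINT_SETS` structures, and the helper functions are all hypothetical and chosen only for illustration. The sketch flags any class that sits, directly or transitively, below two classes that have been declared disjoint.

```python
from itertools import combinations

# Hypothetical toy data; none of these identifiers come from Wikidata.
# Maps each class to its direct superclasses ("subclass of" edges).
SUBCLASS_OF = {
    "dog": {"mammal"},
    "mammal": {"animal"},
    "oak": {"plant"},
    "animal": {"organism"},
    "plant": {"organism"},
    "venus_flytrap": {"plant", "animal"},  # deliberately wrong second edge
}

# Each entry declares a group of pairwise disjoint classes (assumed input).
DISJOINT_SETS = [{"animal", "plant"}]


def ancestors(cls):
    """Return cls plus every class reachable through subclass-of edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def violations():
    """Return (cls, a, b) triples where cls lies below disjoint classes a and b."""
    found = []
    classes = set(SUBCLASS_OF) | {p for ps in SUBCLASS_OF.values() for p in ps}
    for cls in sorted(classes):
        up = ancestors(cls)
        for group in DISJOINT_SETS:
            # Any two members of a disjoint group that are both ancestors
            # of the same class constitute a violation.
            for a, b in combinations(sorted(group & up), 2):
                found.append((cls, a, b))
    return found


if __name__ == "__main__":
    for cls, a, b in violations():
        print(f"{cls} is below both {a} and {b}, which are declared disjoint")
```

On a real knowledge base the transitive closure would normally be computed by the query engine rather than in application code, and every class below a faulty edge would also be flagged, which is why pinpointing the statement that introduces the conflict matters.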

