CL, AI
Apr 20, 2021

Identify, Align, and Integrate: Matching Knowledge Graphs to Commonsense Reasoning Tasks

arXiv:2104.10193v1
805 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of integrating external knowledge into AI systems that perform commonsense reasoning. The advance is incremental, since it builds on existing methods for KG selection and analysis.

The paper tackles the problem of selecting the knowledge graph (KG) that best aligns with a given commonsense reasoning task in order to improve performance. Across three analysis phases (identification, alignment, and integration), it finds that ATOMIC is the best match for SocialIQA and MCScript2.0, while ConceptNet and a WikiHow-based KG are the best matches for Physical IQA.

Integrating external knowledge into commonsense reasoning tasks has shown progress in resolving some, but not all, knowledge gaps in these tasks. For knowledge integration to yield peak performance, it is critical to select a knowledge graph (KG) that is well-aligned with the given task's objective. We present an approach to assess how well a candidate KG can correctly identify and accurately fill in gaps of reasoning for a task, which we call KG-to-task match. We show this KG-to-task match in 3 phases: knowledge-task identification, knowledge-task alignment, and knowledge-task integration. We also analyze our transformer-based KG-to-task models via commonsense probes to measure how much knowledge is captured in these models before and after KG integration. Empirically, we investigate KG matches for the SocialIQA (SIQA) (Sap et al., 2019b), Physical IQA (PIQA) (Bisk et al., 2020), and MCScript2.0 (Ostermann et al., 2019) datasets with 3 diverse KGs: ATOMIC (Sap et al., 2019a), ConceptNet (Speer et al., 2017), and an automatically constructed instructional KG based on WikiHow (Koupaee and Wang, 2018). With our methods we are able to demonstrate that ATOMIC, an event-inference focused KG, is the best match for SIQA and MCScript2.0, and that the taxonomic ConceptNet and WikiHow-based KGs are the best matches for PIQA across all 3 analysis phases. We verify our methods and findings with human evaluation.

Code Implementations: 1 repo
