CL · AI · Jan 1, 2021

DISCOS: Bridging the Gap between Discourse Knowledge and Commonsense Knowledge

arXiv:2101.00154v2 · 48 citations · Has Code
Originality: Highly original
AI Analysis

This work addresses the high cost and limited scale of commonsense knowledge acquisition for AI systems by leveraging existing linguistic resources, offering a more efficient alternative to human annotation or generation models.

This paper introduces DISCOS, a framework that converts discourse knowledge from the ASER knowledge graph into if-then commonsense knowledge of the kind defined in ATOMIC. The method acquires 3.4 million pieces of ATOMIC-like inferential commonsense knowledge without additional human annotation, outperforming previous supervised approaches in novelty and diversity while maintaining comparable quality.

Commonsense knowledge is crucial for artificial intelligence systems to understand natural language. Previous commonsense knowledge acquisition approaches typically rely on human annotations (for example, ATOMIC) or text generation models (for example, COMET). Human annotation can provide high-quality commonsense knowledge, yet its high cost often results in relatively small scale and low coverage. Generation models, on the other hand, have the potential to generate more knowledge automatically. Nonetheless, machine learning models often fit the training data well and thus struggle to generate high-quality novel knowledge. To address the limitations of previous approaches, in this paper we propose an alternative commonsense knowledge acquisition framework, DISCOS (from DIScourse to COmmonSense), which automatically populates expensive, complex commonsense knowledge onto more affordable linguistic knowledge resources. Experiments demonstrate that we can successfully convert discourse knowledge about eventualities from ASER, a large-scale discourse knowledge graph, into if-then commonsense knowledge as defined in ATOMIC without any additional annotation effort. Further study suggests that DISCOS significantly outperforms previous supervised approaches in terms of novelty and diversity while offering comparable quality. In total, we acquire 3.4M pieces of ATOMIC-like inferential commonsense knowledge by populating ATOMIC on the core part of ASER. Code and data are available at https://github.com/HKUST-KnowComp/DISCOS-commonsense.
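
To make the idea of turning a discourse edge into an if-then tuple more concrete, here is a minimal toy sketch. It is not the DISCOS pipeline: the relation mapping, the `DiscourseEdge` type, and the `normalize_subject` and `to_candidate` helpers are all hypothetical names introduced for illustration, and the example edges are invented. It only assumes that a discourse edge links two eventualities with a relation label and that an ATOMIC-style tuple uses a placeholder subject such as PersonX.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical mapping from discourse relation labels to ATOMIC-style relations.
# This is an illustrative assumption, not the mapping used by DISCOS.
DISCOURSE_TO_ATOMIC = {
    "Result": "xEffect",
    "Reason": "xIntent",
}


@dataclass(frozen=True)
class DiscourseEdge:
    """A toy discourse edge: head eventuality, relation label, tail eventuality."""
    head: str
    relation: str
    tail: str


def normalize_subject(eventuality: str) -> str:
    """Replace a leading first-person subject with the placeholder PersonX."""
    words = eventuality.split()
    if words and words[0].lower() == "i":
        words[0] = "PersonX"
    return " ".join(words)


def to_candidate(edge: DiscourseEdge) -> Optional[Tuple[str, str, str]]:
    """Return an if-then candidate tuple, or None if the relation is unmapped."""
    atomic_relation = DISCOURSE_TO_ATOMIC.get(edge.relation)
    if atomic_relation is None:
        return None
    return (
        normalize_subject(edge.head),
        atomic_relation,
        normalize_subject(edge.tail),
    )


# Invented example edges, for illustration only.
edges = [
    DiscourseEdge("I skipped breakfast", "Result", "I felt hungry"),
    DiscourseEdge("I felt tired", "Contrast", "I kept working"),
]

# Keep only edges whose relation has a mapping in the toy table.
candidates = [c for c in (to_candidate(e) for e in edges) if c is not None]
print(candidates)
```

In this toy version every mapped edge becomes a candidate; a real system would still need some way to judge which candidates are plausible commonsense before keeping them.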

Code Implementations: 1 repo