CL · Aug 7, 2023

A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction

Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener

IBM

arXiv:2308.03891v1 · 0.91 citations · h-index: 40 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the extraction of causal relations from text, a task relevant to language understanding and knowledge discovery. The advance is incremental, since it builds on existing methods.

The paper evaluates sequence tagging and span-based models for causal knowledge extraction. It finds that BERT embeddings significantly boost performance and that span-based models outperform BERT-based sequence tagging models on all four datasets, which come from diverse domains.

Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging models for causal knowledge extraction and compare them with a span-based approach to causality extraction. Our experiments show that embeddings from pre-trained language models (e.g., BERT) provide a significant performance boost on this task compared to previous state-of-the-art models with complex architectures. We observe that span-based models perform better than simple sequence tagging models based on BERT across all 4 data sets from diverse domains with different types of cause-effect phrases.
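To make the distinction between sequence tagging and span-based extraction concrete, the following minimal sketch shows how per-token tags can be decoded into cause and effect spans. It is an illustration only, not the paper's implementation: the BIO tag names (`B-CAUSE`, `I-CAUSE`, `B-EFFECT`, `I-EFFECT`, `O`), the helper `bio_to_spans`, and the example sentence are all assumptions introduced here.

```python
from typing import List, Tuple

# A span is (label, start, end) with an exclusive end index (hypothetical format).
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence into labeled spans."""
    spans: List[Span] = []
    label, start = None, None
    for i, tag in enumerate(tags):
        # An I- tag with the same label continues the currently open span.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Any other tag closes the currently open span, if there is one.
        if label is not None:
            spans.append((label, start, i))
            label, start = None, None
        # A B- tag (or a stray I- tag) opens a new span.
        if tag.startswith("B-") or tag.startswith("I-"):
            label, start = tag[2:], i
    # Close a span that runs to the end of the sentence.
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


# Assumed example sentence and tags, for illustration only.
tokens = ["Heavy", "rain", "caused", "severe", "flooding", "."]
tags = ["B-CAUSE", "I-CAUSE", "O", "B-EFFECT", "I-EFFECT", "O"]

spans = bio_to_spans(tags)
phrases = [(label, " ".join(tokens[s:e])) for label, s, e in spans]
print(phrases)
```

A sequence tagging model predicts the `tags` list one token at a time, whereas a span-based model predicts entries like those in `spans` directly, for example by scoring candidate start and end positions.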
