CL, May 28, 2023

Learning a Structural Causal Model for Intuition Reasoning in Conversation

Hang Chen, Bingyu Liao, Jing Luo, Wenjing Zhu, Xinyu Yang

arXiv:2305.17727v24.318 citations Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a critical gap in NLP: conversation reasoning, which remains largely unexplored because no suitable cognitive model exists. The advance appears incremental, however, since it builds on intuition theory and causal methods.

The paper develops a conversation cognitive model (CCM) and transforms it into a structural causal model (SCM) for utterance-level relation reasoning. Its experiments report significant outperformance over existing methods on synthetic, simulated, and real-world datasets.

Reasoning, a crucial aspect of NLP research, has not been adequately addressed by prevailing models, including large language models. Conversation reasoning, a critical component of reasoning, remains largely unexplored due to the absence of a well-designed cognitive model. In this paper, inspired by intuition theory on conversation cognition, we develop a conversation cognitive model (CCM) that explains how each utterance receives and activates channels of information recursively. We then algebraically transform the CCM into a structural causal model (SCM) under some mild assumptions, rendering it compatible with various causal discovery methods. We further propose a probabilistic implementation of the SCM for utterance-level relation reasoning. By leveraging variational inference, it explores substitutes for implicit causes, addresses the issue of their unobservability, and reconstructs the causal representations of utterances through the evidence lower bounds. Moreover, we construct synthetic and simulated datasets that incorporate implicit causes and complete cause labels, alleviating the current situation in which all available datasets are implicit-cause-agnostic. Extensive experiments demonstrate that our proposed method significantly outperforms existing methods on synthetic, simulated, and real-world datasets. Finally, we analyze the performance of CCM under latent confounders and propose theoretical ideas for addressing this currently unresolved issue.
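The abstract does not state the SCM's equations, so the following is only a minimal sketch of the general idea of a structural causal model over utterances with implicit causes. It assumes a linear model with one scalar per utterance, where each utterance depends only on earlier utterances plus its own unobserved cause. The function name `sample_linear_scm`, the `WEIGHTS` matrix, and the Gaussian noise are all hypothetical choices for illustration, not the paper's method.

```python
import random


def sample_linear_scm(weights, rng):
    """Sample one conversation from a toy linear SCM (illustrative assumption).

    weights[i][j] is the assumed causal effect of utterance j on utterance i.
    Only earlier utterances (j < i) may act as causes, so the matrix must be
    strictly lower triangular.
    """
    n = len(weights)
    for i in range(n):
        for j in range(i, n):
            if weights[i][j] != 0.0:
                raise ValueError("utterances may only depend on earlier ones")

    # One implicit (unobserved) cause per utterance, drawn as Gaussian noise.
    implicit = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Generate utterances in conversational order: each is its implicit cause
    # plus a weighted sum of the utterances that precede it.
    utterances = []
    for i in range(n):
        value = implicit[i] + sum(weights[i][j] * utterances[j] for j in range(i))
        utterances.append(value)
    return utterances, implicit


# Hypothetical three-turn conversation: turn 0 -> turn 1 -> turn 2.
WEIGHTS = [
    [0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0],
    [0.0, 0.5, 0.0],
]

utterances, implicit = sample_linear_scm(WEIGHTS, random.Random(0))
```

In a sketch like this, data generated with known `WEIGHTS` and known `implicit` values plays the role of a synthetic dataset with complete cause labels, since both the explicit edges and the implicit causes are available for evaluation.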
