AI | CL | Oct 12, 2024

Zero-shot Commonsense Reasoning over Machine Imagination

arXiv:2410.09329v1 | 24 citations | h-index: 7 | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses a key limitation in commonsense reasoning for AI systems. It is an incremental advance, however, since its main contribution is integrating visual signals into existing language models.

The paper tackles the problem of human reporting bias in zero-shot commonsense reasoning by introducing a framework that complements textual inputs with machine-generated images, resulting in large-margin performance improvements on diverse benchmarks.

Recent approaches to zero-shot commonsense reasoning have enabled Pre-trained Language Models (PLMs) to learn a broad range of commonsense knowledge without being tailored to specific situations. However, they often suffer from human reporting bias inherent in textual commonsense knowledge, leading to discrepancies in understanding between PLMs and humans. In this work, we aim to bridge this gap by introducing an additional information channel to PLMs. We propose Imagine (Machine Imagination-based Reasoning), a novel zero-shot commonsense reasoning framework designed to complement textual inputs with visual signals derived from machine-generated images. To achieve this, we enhance PLMs with imagination capabilities by incorporating an image generator into the reasoning process. To guide PLMs in effectively leveraging machine imagination, we create a synthetic pre-training dataset that simulates visual question-answering. Our extensive experiments on diverse reasoning benchmarks and analysis show that Imagine outperforms existing methods by a large margin, highlighting the strength of machine imagination in mitigating reporting bias and enhancing generalization capabilities.
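The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a multiple-choice setting in which an image is generated from the question and a text-based score is blended with an image-conditioned score for each candidate answer. Every name here (`answer_with_imagination`, the three toy callables, and the `weight` parameter) is hypothetical and does not come from the paper or its repository.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_with_imagination(
    question: str,
    choices: Sequence[str],
    text_scorer: Callable[[str, str], float],
    image_generator: Callable[[str], Any],
    visual_scorer: Callable[[Any, str, str], float],
    weight: float = 0.5,  # hypothetical blending weight, not from the paper
) -> Tuple[int, List[float]]:
    """Return the index of the best choice and the blended probabilities."""
    # "Imagine" the scene once per question (stand-in for a text-to-image model).
    image = image_generator(question)
    text_probs = softmax([text_scorer(question, c) for c in choices])
    visual_probs = softmax([visual_scorer(image, question, c) for c in choices])
    # Blend the two information channels with a simple convex combination.
    blended = [
        (1.0 - weight) * t + weight * v
        for t, v in zip(text_probs, visual_probs)
    ]
    best_index = max(range(len(choices)), key=lambda i: blended[i])
    return best_index, blended


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_text_scorer(question: str, choice: str) -> float:
    return 0.0  # text alone is uninformative in this toy case


def toy_image_generator(question: str) -> dict:
    return {"dominant_color": "yellow"}  # pretend "generated image"


def toy_visual_scorer(image: dict, question: str, choice: str) -> float:
    return 2.0 if choice == image["dominant_color"] else 0.0


choices = ["blue", "yellow", "red"]
best, probs = answer_with_imagination(
    "What color is a ripe banana?",
    choices,
    toy_text_scorer,
    toy_image_generator,
    toy_visual_scorer,
)
print(choices[best])
```

In this toy case the text channel assigns equal scores to all choices, so the visual channel decides the answer. In a real system the three callables would wrap a pre-trained language model, an image generator, and a vision-language scorer, and the blending rule could differ from the convex combination assumed here.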

Code Implementations: 1 repo