AbductionRules: Training Transformers to Explain Unexpected Inputs
This work addresses abductive reasoning, which matters for AI interpretability and common-sense reasoning, but its contribution is incremental because it builds on existing Transformer capabilities.
The paper tackles abductive reasoning in Transformers, which remains underexplored despite its applications. It introduces the AbductionRules datasets and finetunes pretrained models on them, finding that the models learned generalisable abductive techniques but also learned to exploit the structure of the data.
Transformers have recently been shown to be capable of reliably performing logical reasoning over facts and rules expressed in natural language, but abductive reasoning (inference to the best explanation of an unexpected observation) has been underexplored despite significant applications to scientific discovery, common-sense reasoning, and model interpretability. We present AbductionRules, a group of natural-language datasets designed to train and test generalisable abduction over natural-language knowledge bases. We use these datasets to finetune pretrained Transformers and discuss their performance, finding that our models learned generalisable abductive techniques but also learned to exploit the structure of our data. Finally, we discuss the viability of this approach to abductive reasoning and ways in which it may be improved in future work.
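To make the notion of abduction over a knowledge base concrete, the following is a minimal symbolic sketch in Python. It is not the paper's method, which finetunes pretrained Transformers, and it does not reproduce the AbductionRules data format. The `Rule` type, the `abduce` function, and the example sentences are all hypothetical and chosen only for illustration. Given known facts, a set of rules, and an observation that the facts alone do not explain, the sketch returns the missing facts that would let some rule derive the observation.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule: if every premise holds, the conclusion follows."""
    premises: frozenset[str]
    conclusion: str


def abduce(facts: set[str], rules: list[Rule], observation: str) -> list[frozenset[str]]:
    """Return candidate explanations for an unexpected observation.

    Each candidate is the set of facts missing from the knowledge base
    that, if added, would let one rule derive the observation.
    """
    candidates = []
    for rule in rules:
        if rule.conclusion != observation:
            continue  # this rule cannot produce the observation
        missing = rule.premises - facts
        if missing:  # skip rules already satisfied: nothing left to explain
            candidates.append(frozenset(missing))
    # Assumption for this sketch: prefer explanations that assume the least.
    return sorted(candidates, key=lambda m: (len(m), sorted(m)))


# Toy knowledge base (invented sentences, not taken from the datasets).
facts = {"the cat is furry"}
rules = [
    Rule(frozenset({"the cat is kind", "the cat is quiet"}), "the cat is cute"),
    Rule(frozenset({"the cat is furry", "the cat is small"}), "the cat is cute"),
]
observation = "the cat is cute"

explanations = abduce(facts, rules, observation)
print(explanations[0])  # the smallest set of assumed facts
```

In this toy case the preferred explanation is the single missing fact "the cat is small", since the other rule would require assuming two facts. Ranking by the number of assumed facts is one simple choice of "best explanation"; other criteria are possible.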