CL | AI | Apr 17, 2023

From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning

arXiv:2304.07995v1 | 19 citations | h-index: 29
Originality: Incremental advance
AI Analysis

This work aims to improve the zero-shot performance that instruction tuning delivers, particularly in table reasoning. The advance is incremental: it builds on existing instruction-tuning methods and adds a new task type.

The paper tackles the problem of improving zero-shot generalization in language models by incorporating symbolic tasks into instruction tuning, demonstrating that a 3B model outperforms 175B GPT-3 and ChatGPT in zero-shot table reasoning across four benchmarks.

Fine-tuning language models on tasks with instructions has demonstrated potential in facilitating zero-shot generalization to unseen tasks. In this paper, we introduce a straightforward yet effective method for enhancing instruction tuning by employing symbolic tasks. Compared to crowdsourced human tasks or model-generated tasks, symbolic tasks present a unique advantage as they can be easily generated in vast quantities, theoretically providing an infinite supply of high-quality training instances. To explore the potential of symbolic tasks, we carry out an extensive case study on the representative symbolic task of SQL execution. Empirical results on various benchmarks validate that the integration of SQL execution leads to significant improvements in zero-shot scenarios, particularly in table reasoning. Notably, our 3B model surpasses both the 175B GPT-3 and ChatGPT in zero-shot table reasoning across four benchmarks. Furthermore, experimental results on BBH (27 tasks) and MMLU (57 tasks) reveal that language models can be enhanced through symbolic tasks without compromising their generality. We hope that our paper serves as a catalyst, inspiring increased efforts to incorporate symbolic tasks in instruction tuning.
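To make the "infinite supply" claim concrete, here is a minimal sketch of how SQL-execution training instances could be synthesized programmatically. It is not the authors' pipeline: the table schema, the query template, the prompt wording, and the helper names (`make_table`, `make_query`, `make_instance`) are all assumptions chosen for illustration. The only point it demonstrates is that a database engine can label every generated example for free.

```python
import random
import sqlite3

# Hypothetical schema, used only for this illustration.
COLUMNS = ["name", "age", "score"]
NAMES = ["Ann", "Bob", "Cy", "Dee", "Eve", "Fay", "Gus"]


def make_table(rng, n_rows=5):
    """Sample a small random table as a list of (name, age, score) rows."""
    return [
        (rng.choice(NAMES), rng.randint(18, 60), rng.randint(0, 100))
        for _ in range(n_rows)
    ]


def make_query(rng):
    """Fill a single assumed query template with a random aggregate and threshold."""
    agg = rng.choice(["COUNT(*)", "MAX(score)", "MIN(age)", "SUM(score)"])
    threshold = rng.randint(18, 60)
    return f"SELECT {agg} FROM t WHERE age >= {threshold}"


def make_instance(seed):
    """Build one (instruction, input, output) example; SQLite supplies the label."""
    rng = random.Random(seed)
    rows = make_table(rng)
    query = make_query(rng)

    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (name TEXT, age INTEGER, score INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        result = conn.execute(query).fetchall()
    finally:
        conn.close()

    # Flatten the table into text so it can be placed in a model prompt.
    header = " | ".join(COLUMNS)
    body = "\n".join(" | ".join(str(v) for v in row) for row in rows)
    # Aggregates over an empty selection return NULL (None in Python).
    answer = ", ".join(
        "NULL" if v is None else str(v) for row in result for v in row
    )
    return {
        "instruction": "Execute the SQL query on the table and give the result.",
        "input": f"{header}\n{body}\n\n{query}",
        "output": answer,
    }


if __name__ == "__main__":
    example = make_instance(seed=0)
    print(example["input"])
    print("->", example["output"])
```

Because each instance is fully determined by its seed, calling `make_instance` over a range of seeds yields as many correctly labeled examples as desired. A realistic setup would presumably draw on far more varied tables and query shapes than this single template.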

Code Implementations: 1 repo
