CL | LG | Sep 4, 2023

Prompting or Fine-tuning? A Comparative Study of Large Language Models for Taxonomy Construction

arXiv:2309.01715v1 | 36 citations | h-index: 46
Originality: Incremental advance
AI Analysis

This work addresses automated taxonomy construction, a problem relevant to software modeling and NLP practitioners, and offers guidance on method selection. It is an incremental advance because it builds on existing LLM techniques.

The study compares prompting and fine-tuning for constructing taxonomies with large language models. Prompting outperforms fine-tuning even without explicit training on the dataset, and the gap widens when the training dataset is small. Taxonomies produced by fine-tuning, however, are easier to post-process so that they satisfy structural constraints.

Taxonomies represent hierarchical relations between entities, frequently applied in various software modeling and natural language processing (NLP) activities. They are typically subject to a set of structural constraints restricting their content. However, manual taxonomy construction can be time-consuming, incomplete, and costly to maintain. Recent studies of large language models (LLMs) have demonstrated that appropriate user inputs (called prompting) can effectively guide LLMs, such as GPT-3, in diverse NLP tasks without explicit (re-)training. However, existing approaches for automated taxonomy construction typically involve fine-tuning a language model by adjusting model parameters. In this paper, we present a general framework for taxonomy construction that takes into account structural constraints. We subsequently conduct a systematic comparison between the prompting and fine-tuning approaches performed on a hypernym taxonomy and a novel computer science taxonomy dataset. Our results reveal the following: (1) Even without explicit training on the dataset, the prompting approach outperforms fine-tuning-based approaches. Moreover, the performance gap between prompting and fine-tuning widens when the training dataset is small. However, (2) taxonomies generated by the fine-tuning approach can be easily post-processed to satisfy all the constraints, whereas handling violations of the taxonomies produced by the prompting approach can be challenging. These evaluation findings provide guidance on selecting the appropriate method for taxonomy construction and highlight potential enhancements for both approaches.
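
The abstract does not say how predicted taxonomies are post-processed, so the following is only an illustrative sketch, not the paper's method. It assumes the structural constraints are "each concept has at most one parent" and "no cycles", and that a model has produced scored parent-child candidates. The function name `enforce_tree`, the `Edge` type, and the greedy strategy are all hypothetical choices made for this example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical input format: (parent, child, score), e.g. a model's
# confidence that `parent` is a hypernym of `child`.
Edge = Tuple[str, str, float]


def enforce_tree(scored_edges: Iterable[Edge]) -> List[Tuple[str, str]]:
    """Greedily keep high-scoring edges that preserve a forest structure.

    Assumed constraints (not taken from the paper): every node has at
    most one parent, and the result contains no cycles.
    """
    parent_of: Dict[str, str] = {}
    kept: List[Tuple[str, str]] = []

    # Visit candidates from most to least confident.
    for parent, child, _score in sorted(
        scored_edges, key=lambda e: e[2], reverse=True
    ):
        # Skip self-loops and children that already have a parent.
        if parent == child or child in parent_of:
            continue

        # Walk up from `parent`; reaching `child` means the new edge
        # would close a cycle.
        node = parent
        creates_cycle = False
        while node in parent_of:
            node = parent_of[node]
            if node == child:
                creates_cycle = True
                break
        if creates_cycle:
            continue

        parent_of[child] = parent
        kept.append((parent, child))

    return kept


candidates = [
    ("animal", "dog", 0.9),
    ("mammal", "dog", 0.8),     # dropped: "dog" already has a parent
    ("animal", "mammal", 0.7),
    ("dog", "animal", 0.6),     # dropped: would create a cycle
]
taxonomy = enforce_tree(candidates)
```

A greedy pass like this is simple but not guaranteed to find the highest-scoring valid tree; it is meant only to show why scored edges are convenient to repair. Free-form text from a prompted model offers no comparable scores to rank conflicting edges by.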

Code Implementations: 1 repo