CL AIMar 15, 2023

Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!

Yubo Ma, Yixin Cao, YongChing Hong, Aixin Sun

arXiv:2303.08559v227.7223 citationsh-index: 63Has Code

Originality Incremental advance

AI Analysis

This addresses the problem of improving information extraction efficiency and accuracy for researchers and practitioners by combining LLMs and SLMs, though it is incremental as it builds on existing methods.

The study found that large language models (LLMs) are generally ineffective as few-shot information extractors, showing inferior performance, higher latency, and increased costs compared to fine-tuned small language models (SLMs) across nine datasets and four tasks. However, by using LLMs as rerankers for hard samples identified by SLMs in an adaptive filter-then-rerank paradigm, the system achieved an average 2.4% F1-gain with acceptable time and cost.

Large Language Models (LLMs) have made remarkable strides in various tasks. Whether LLMs are competitive few-shot solvers for information extraction (IE) tasks, however, remains an open problem. In this work, we aim to provide a thorough answer to this question. Through extensive experiments on nine datasets across four IE tasks, we demonstrate that current advanced LLMs consistently exhibit inferior performance, higher latency, and increased budget requirements compared to fine-tuned SLMs under most settings. Therefore, we conclude that LLMs are not effective few-shot information extractors in general. Nonetheless, we illustrate that with appropriate prompting strategies, LLMs can effectively complement SLMs and tackle challenging samples that SLMs struggle with. And moreover, we propose an adaptive filter-then-rerank paradigm to combine the strengths of LLMs and SLMs. In this paradigm, SLMs serve as filters and LLMs serve as rerankers. By prompting LLMs to rerank a small portion of difficult samples identified by SLMs, our preliminary system consistently achieves promising improvements (2.4% F1-gain on average) on various IE tasks, with an acceptable time and cost investment.

View on arXiv PDF Code

Similar