From Cross-Task Examples to In-Task Prompts: A Graph-Based Pseudo-Labeling Framework for In-context Learning
This work addresses the challenge of high labeling costs for new or difficult tasks in in-context learning, offering a cost-efficient solution that is incremental in nature.
The paper tackles the problem of costly data labeling for in-context learning by proposing a two-stage pipeline that uses cross-task examples to pseudo-label a small set with an LLM and then propagates labels via a graph-based method, achieving strong performance across five tasks while reducing labeling costs.
The capability of in-context learning (ICL) enables large language models (LLMs) to perform novel tasks without parameter updates by conditioning on a few input-output examples. However, collecting high-quality examples for new or challenging tasks can be costly and labor-intensive. In this work, we propose a cost-efficient two-stage pipeline that reduces reliance on LLMs for data labeling. Our approach first leverages readily available cross-task examples to prompt an LLM and pseudo-label a small set of target task instances. We then introduce a graph-based label propagation method that spreads label information to the remaining target examples without additional LLM queries. The resulting fully pseudo-labeled dataset is used to construct in-task demonstrations for ICL. This pipeline combines the flexibility of cross-task supervision with the scalability of LLM-free propagation. Experiments across five tasks demonstrate that our method achieves strong performance while lowering labeling costs.