HC, CL · Feb 14, 2023

ScatterShot: Interactive In-context Example Curation for Text Transformation

Tongshuang Wu, Hua Shen, Daniel S. Weld, Jeffrey Heer, Marco Tulio Ribeiro

CMU, Microsoft, UW

arXiv:2302.07346v1 · 18.540 citations · h-index: 93 · Has Code

Originality: Highly original

AI Analysis

The paper addresses the difficulty annotators face in efficiently curating effective in-context examples for text transformation tasks. It is an incremental improvement delivered through a novel interactive method.

The paper tackles the problem of users underspecifying in-context functions for LLMs. It introduces ScatterShot, an interactive system that improves demonstration set quality; in simulation studies, ScatterShot sampling improves the resulting few-shot functions by 4-5 percentage points over random sampling.

The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an LLM to their specific tasks with a small number of examples. However, users tend to include only the most obvious patterns when crafting examples, resulting in underspecified in-context functions that fall short on unseen cases. Further, it is hard to know when "enough" examples have been included even for known patterns. In this work, we present ScatterShot, an interactive system for building high-quality demonstration sets for in-context learning. ScatterShot iteratively slices unlabeled data into task-specific patterns, samples informative inputs from underexplored or not-yet-saturated slices in an active learning manner, and helps users label more efficiently with the help of an LLM and the current example set. In simulation studies on two text perturbation scenarios, ScatterShot sampling improves the resulting few-shot functions by 4-5 percentage points over random sampling, with less variance as more examples are added. In a user study, ScatterShot greatly helps users in covering different patterns in the input space and labeling in-context examples more efficiently, resulting in better in-context learning and less user effort.
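The slice-based sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the weighting formula, the `explore_bonus` parameter, and the function names `slice_weights` and `sample_next` are all assumptions made for illustration. The sketch assumes each slice tracks how many examples have been labeled from it and a recent error rate (for instance, the fraction of LLM-proposed labels the user had to correct), and it favors slices that are underexplored or not yet saturated.

```python
import random


def slice_weights(labeled_counts, recent_error, explore_bonus=1.0):
    """Hypothetical scoring: higher weight for slices that are underexplored
    (few labeled examples) or not yet saturated (high recent error)."""
    weights = {}
    for name, n_labeled in labeled_counts.items():
        # A slice with no error estimate yet is treated as fully unsaturated.
        err = recent_error.get(name, 1.0)
        weights[name] = err + explore_bonus / (1 + n_labeled)
    return weights


def sample_next(slices, labeled_counts, recent_error, rng):
    """Pick a slice in proportion to its weight, then pick one unlabeled
    input from that slice. Returns None when nothing is left to label."""
    candidates = [name for name, items in slices.items() if items]
    if not candidates:
        return None
    counts = {name: labeled_counts.get(name, 0) for name in candidates}
    weights = slice_weights(counts, recent_error)
    chosen = rng.choices(candidates, weights=[weights[c] for c in candidates], k=1)[0]
    return chosen, rng.choice(slices[chosen])


if __name__ == "__main__":
    # Toy data: slice names and inputs are made up for this example.
    slices = {
        "dates": ["meet on 3/4", "due by Friday"],
        "durations": ["takes 2 hours"],
    }
    labeled_counts = {"dates": 9, "durations": 0}
    recent_error = {"dates": 0.0}  # "dates" looks saturated; "durations" is unexplored
    print(sample_next(slices, labeled_counts, recent_error, random.Random(0)))
```

In this sketch a well-covered, low-error slice is still sampled occasionally because its weight never reaches zero, which is one simple way to keep checking slices that appear saturated.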

