Leveraging Multiple Teachers for Test-Time Adaptation of Language-Guided Classifiers
This work addresses the challenge of improving reliability and leveraging unlabeled data in language-guided classification when explanations come from multiple teachers or a crowd, though the contribution is incremental because it builds on existing methods.
The paper tackles unpredictable performance variation in language-guided classifiers across different natural language explanations. It introduces TALC, a framework that adapts these classifiers at test time using explanations from multiple teachers together with unlabeled data, achieving a 9.3% relative improvement over a baseline.
Recent approaches have explored language-guided classifiers capable of classifying examples from novel tasks when provided with task-specific natural language explanations, instructions or prompts (Sanh et al., 2022; R. Menon et al., 2022). While these classifiers can generalize in zero-shot settings, their task performance often varies substantially and unpredictably across different language explanations (Lu et al., 2022; Gonen et al., 2022). Moreover, current approaches fail to leverage the unlabeled examples that are available in many scenarios. Here, we introduce TALC, a framework that uses data programming to adapt a language-guided classifier to a new task during inference, given explanations from multiple teachers and unlabeled test examples. Our results show that TALC consistently outperforms a competitive baseline from prior work by 9.3% (relative improvement). Further, we demonstrate the robustness of TALC to variations in the quality and quantity of provided explanations, highlighting its potential in scenarios that involve learning from multiple teachers or a crowd. Our code is available at: https://github.com/WeiKangda/TALC.git.
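The abstract does not describe how predictions made under different explanations are combined. To give a rough sense of aggregating label votes over unlabeled examples without ground truth, the sketch below uses an iterative weighted vote: it starts from a majority vote, estimates each explanation's reliability from its agreement with the current consensus, and re-weights accordingly. This is a simplified stand-in and should not be read as TALC's actual label model. The function name `aggregate_votes`, the vote matrix layout, and the toy data are all hypothetical.

```python
import math


def aggregate_votes(votes, num_classes, iterations=10):
    """Combine per-explanation predictions into one pseudo-label per example.

    votes[i][j] is the class predicted for unlabeled example i when the
    classifier is conditioned on explanation j (hypothetical layout).
    """
    num_examples = len(votes)
    num_teachers = len(votes[0])
    weights = [1.0] * num_teachers  # first pass is a plain majority vote
    labels = []
    for _ in range(iterations):
        # Step 1: weighted vote per example (ties go to the lowest class id).
        labels = []
        for row in votes:
            scores = [0.0] * num_classes
            for j, predicted in enumerate(row):
                scores[predicted] += weights[j]
            labels.append(max(range(num_classes), key=lambda c: scores[c]))
        # Step 2: re-estimate each explanation's accuracy as its agreement
        # with the consensus, then convert it to a log-odds weight.
        for j in range(num_teachers):
            agree = sum(1 for i in range(num_examples) if votes[i][j] == labels[i])
            acc = (agree + 1) / (num_examples + 2)  # Laplace smoothing
            weights[j] = math.log(acc / (1 - acc))
    return labels, weights


# Toy data: six unlabeled examples, three explanations of varying quality.
votes = [
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
]
labels, weights = aggregate_votes(votes, num_classes=2)
print(labels)
print([round(w, 3) for w in weights])
```

In this toy run the explanation that disagrees with the consensus most often ends up with the smallest weight, so it has the least influence on the final pseudo-labels. A fuller data programming setup would typically learn these reliabilities with a probabilistic label model rather than this heuristic.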