Stacking Small Language Models for Generalizability
The work addresses the high cost of LLMs for users in resource-limited settings and offers a cost-effective alternative, though it appears incremental because it builds on existing fine-tuning and stacking methods.
The paper tackles the high cost and impracticality of large language models (LLMs) in resource-limited settings by introducing fine-tuning stacks of language models (FSLM), an approach that breaks reasoning down into steps, each handled by a specific small language model. This lowers training and inference costs and improves interpretability, with promising early results on natural language benchmarks.
Recent advances show that large language models (LLMs) generalize well, achieving strong performance across different natural language benchmarks. However, the large size of LLMs makes training and inference expensive and impractical in resource-limited settings. This paper introduces a new approach called fine-tuning stacks of language models (FSLM), which stacks small language models (SLMs) as an alternative to LLMs. By fine-tuning each SLM to perform a specific task, this approach breaks high-level reasoning down into multiple lower-level steps, each handled by a specific SLM. As a result, FSLM lowers training and inference costs and also improves model interpretability, since each SLM communicates with the subsequent one through natural language. By evaluating FSLM on common natural language benchmarks, this paper highlights promising early results toward generalizable performance using FSLM as a cost-effective alternative to LLMs.
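The stacking idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Stage` and `run_stack` and the three toy stages are hypothetical, and each "model" is a plain text-to-text function standing in for a fine-tuned SLM. The sketch only shows the control flow, in which each stage receives the previous stage's natural-language output and the intermediate texts are kept for inspection.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a small language model is modeled as a text-to-text function.
TextModel = Callable[[str], str]


@dataclass
class Stage:
    """One step of the stack (hypothetical structure, not from the paper)."""
    name: str
    model: TextModel


def run_stack(stages: List[Stage], prompt: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Pass text through each stage in order and record every intermediate output."""
    trace: List[Tuple[str, str]] = []
    text = prompt
    for stage in stages:
        text = stage.model(text)          # output of one stage is the next stage's input
        trace.append((stage.name, text))  # readable intermediate text aids interpretability
    return text, trace


# Toy stand-ins for fine-tuned SLMs; a real stack would call actual models here.
def extract_keywords(text: str) -> str:
    words = [w.strip(".,?!").lower() for w in text.split()]
    return "keywords: " + ", ".join(w for w in words if len(w) > 4)


def draft_answer(text: str) -> str:
    return "draft based on " + text


def polish(text: str) -> str:
    return text.capitalize() + "."


stack = [
    Stage("extract", extract_keywords),
    Stage("draft", draft_answer),
    Stage("polish", polish),
]

final, trace = run_stack(stack, "Explain photosynthesis briefly")
for name, output in trace:
    print(f"{name}: {output}")
```

Because every hand-off is plain text, the `trace` list can be read directly to see what each stage contributed, which is the interpretability property the abstract points to.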