LLMs as Function Approximators: Terminology, Taxonomy, and Questions for Evaluation
This paper provides a conceptual framework that helps researchers and practitioners assess the strengths and weaknesses of large language models, though its contribution is incremental, refining existing evaluation approaches.
The paper addresses the lack of a clear evaluation framework for generalist language models. It proposes viewing them as approximators of functions specified in natural language, a view that unifies practical and theoretical aspects of evaluation, including issues such as prompt injection and jailbreaking.
Natural Language Processing has moved rather quickly from modelling specific tasks, to taking more general pre-trained models and fine-tuning them for specific tasks, to a point where we now have what appear to be inherently generalist models. This paper argues that the resulting loss of clarity about what these models model leads to metaphors like "artificial general intelligences" that are not helpful for evaluating their strengths and weaknesses. The proposal is to see their generality, and their potential value, in their ability to approximate specialist functions based on a natural language specification. This framing brings to the fore questions about the quality of the approximation and, beyond that, questions about the discoverability, stability, and protectability of these functions. As the paper will show, this framing brings together in one conceptual framework various aspects of evaluation, from both a practical and a theoretical perspective, as well as questions often relegated to a secondary status (such as "prompt injection" and "jailbreaking").
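To make the framing concrete, the following minimal sketch treats a model as a map from a natural language specification and an input to an output, so that each specification induces a function whose approximation quality and stability can be measured. Everything named here is an illustrative assumption and not taken from the paper: `toy_model` is a hand-written stub standing in for a language model, and `approximation_quality` and `stability` are one simple way such measures could be defined, not the paper's definitions.

```python
from typing import Callable, Sequence

# Hypothetical model interface: (specification, input) -> output.
Model = Callable[[str, str], str]


def toy_model(spec: str, x: str) -> str:
    """Toy stub, NOT a real LLM: uppercases only if the spec mentions 'upper'."""
    if "upper" in spec.lower():
        return x.upper()
    return x


def induced_function(model: Model, spec: str) -> Callable[[str], str]:
    """The function that a specification induces on a given model."""
    return lambda x: model(spec, x)


def approximation_quality(
    f: Callable[[str], str],
    target: Callable[[str], str],
    inputs: Sequence[str],
) -> float:
    """Fraction of inputs on which f agrees with the target function."""
    return sum(f(x) == target(x) for x in inputs) / len(inputs)


def stability(model: Model, specs: Sequence[str], inputs: Sequence[str]) -> float:
    """Fraction of inputs on which all paraphrased specs yield the same output."""
    agree = sum(len({model(s, x) for s in specs}) == 1 for x in inputs)
    return agree / len(inputs)


# Target specialist function and three paraphrases of its specification.
target = str.upper
specs = [
    "Convert the text to upper case.",
    "Return the input in UPPERCASE.",
    "Capitalise every letter.",  # the stub does not recognise this phrasing
]
inputs = ["abc", "Hello", "OK", "x1"]

for spec in specs:
    f = induced_function(toy_model, spec)
    print(spec, approximation_quality(f, target, inputs))
print("stability:", stability(toy_model, specs, inputs))
```

In this toy setting the first two specifications induce the target function exactly, while the third paraphrase does not, which lowers both its approximation quality and the stability across paraphrases. With a real model, `toy_model` would be replaced by a call to that model, and exact string equality would likely be too strict a notion of agreement.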