In-Context Learning Functions with Varying Number of Minima
This work examines how one specific property of the target function, its number of minima, affects ICL, and offers incremental insight into ICL's limitations and advantages relative to neural networks.
The study examines how In-Context Learning (ICL) in Large Language Models performs when approximating functions with varying numbers of minima. It finds that increasing the number of minima degrades ICL performance, but that ICL still outperforms a 2-layer Neural Network model and learns faster in all settings.
Large Language Models (LLMs) have proven effective at In-Context Learning (ICL), an ability that allows them to create predictors from labeled examples. Few studies have explored the interplay between ICL and specific properties of the functions it attempts to approximate. In our study, we use a formal framework to explore ICL and propose a new task: approximating functions with a varying number of minima. We implement a method for producing functions that have a given set of inputs as minima. We find that increasing the number of minima degrades ICL performance. At the same time, our evaluation shows that ICL outperforms a 2-layer Neural Network (2NN) model. Furthermore, ICL learns faster than the 2NN in all settings. We validate the findings through a set of few-shot experiments across various hyperparameter configurations.
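The abstract does not say how functions with prescribed minima are constructed. As a purely illustrative sketch, and not the authors' method, one simple way to place minima at chosen inputs is a product of squared distances, f(x) = prod_i (x - m_i)^2, which is non-negative everywhere and zero exactly at each m_i. The names `make_function_with_minima` and `sample_few_shot_examples`, the sampling range, and the example minima are all hypothetical choices made for this example.

```python
import math
import random


def make_function_with_minima(minima):
    """Return f(x) = prod_i (x - m_i)^2 (an assumed construction).

    f is non-negative and equals zero only at the given points,
    so every m_i is a global minimum of f.
    """
    points = list(minima)

    def f(x):
        return math.prod((x - m) ** 2 for m in points)

    return f


def sample_few_shot_examples(f, n_examples, low=-1.0, high=1.0, seed=0):
    """Draw labeled (x, f(x)) pairs, as one might place in an ICL prompt."""
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n_examples)]
    return [(x, f(x)) for x in xs]


# Hypothetical target function with three minima.
minima = [-0.5, 0.0, 0.75]
f = make_function_with_minima(minima)
examples = sample_few_shot_examples(f, n_examples=8)
```

Varying the length of `minima` changes the number of minima of the target function, which is the quantity the study varies. The few-shot pairs in `examples` stand in for the labeled in-context examples a model would receive.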