FEET: A Framework for Evaluating Embedding Techniques
This work provides a framework for researchers to systematically evaluate and compare foundation models. The contribution is incremental: it organizes existing evaluation practices rather than introducing new methods.
The authors introduce FEET, a standardized evaluation protocol for foundation models. It addresses the lack of structured benchmarking by defining three use cases (frozen, few-shot, and fully fine-tuned embeddings), and demonstrates the protocol through case studies in sentiment analysis and the medical domain.
In this study, we introduce FEET, a standardized protocol designed to guide the development and benchmarking of foundation models. While numerous benchmark datasets exist for evaluating these models, we propose a structured evaluation protocol across three distinct scenarios to gain a comprehensive understanding of their practical performance. We define three primary use cases: frozen embeddings, few-shot embeddings, and fully fine-tuned embeddings. Each scenario is detailed and illustrated through two case studies: one in sentiment analysis and another in the medical domain, demonstrating how these evaluations provide a thorough assessment of foundation models' effectiveness in research applications. We recommend this protocol as a standard for future research aimed at advancing representation learning models.
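To make the three-scenario structure concrete, the following is a minimal sketch of how results for one model might be collected and compared across the frozen, few-shot, and fully fine-tuned use cases. It is illustrative rather than the protocol's reference implementation: the names `SCENARIOS`, `relative_change`, and `format_report`, the choice to report percent change against the frozen baseline, and all numbers are assumptions introduced here.

```python
from typing import Dict

# The three use cases named in the abstract (identifiers are our own).
SCENARIOS = ("frozen", "few_shot", "fine_tuned")


def relative_change(scores: Dict[str, float]) -> Dict[str, float]:
    """Percent change of each scenario's score relative to the frozen baseline.

    Using the frozen score as the reference point is an assumption made
    for this sketch.
    """
    missing = [s for s in SCENARIOS if s not in scores]
    if missing:
        # A report is only meaningful if all three scenarios were evaluated.
        raise ValueError(f"missing scenarios: {missing}")
    base = scores["frozen"]
    if base == 0:
        raise ValueError("frozen baseline score must be non-zero")
    return {s: 100.0 * (scores[s] - base) / base for s in SCENARIOS}


def format_report(model: str, scores: Dict[str, float]) -> str:
    """Render a plain-text block with one row per scenario for a single model."""
    deltas = relative_change(scores)
    lines = [model]
    for s in SCENARIOS:
        lines.append(
            f"  {s:<10} score={scores[s]:.3f}  change_vs_frozen={deltas[s]:+.1f}%"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not results from the paper.
    example = {"frozen": 0.80, "few_shot": 0.84, "fine_tuned": 0.90}
    print(format_report("hypothetical-encoder", example))
```

Requiring all three scores before producing a report mirrors the abstract's point that a single scenario gives an incomplete picture of a model's practical performance.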