CL | AI | LG | Jul 17, 2023

Mini-Giants: "Small" Language Models and Open Source Win-Win

arXiv:2307.08189v2 | 11 citations | h-index: 7 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the high cost and limited accessibility of large language models for researchers and developers. It is incremental, building on existing trends in small-model development.

The paper argues that small language models, termed 'mini-giants', offer a cost-effective alternative to large models like ChatGPT, highlighting their growing competence and potential for open-source collaboration in technical, ethical, and social contexts.

ChatGPT is phenomenal. However, it is prohibitively expensive to train and refine such giant models. Fortunately, small language models are flourishing and becoming more and more competent. We call them "mini-giants". We argue that open-source communities like Kaggle and mini-giants will both win in many ways: technically, ethically, and socially. In this article, we present a brief yet rich background, discuss how to attain small language models, present a comparative study of small language models together with a brief discussion of evaluation methods, discuss the real-world application scenarios where small language models are most needed, and conclude with discussion and outlook.

