CL, AI, LG | Jul 17, 2023

Mini-Giants: "Small" Language Models and Open Source Win-Win

Zhengping Zhou, Lezhi Li, Xinxi Chen, Andy Li

arXiv:2307.08189v21.311 citations | h-index: 7 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the high cost and accessibility issues of large language models for researchers and developers, though it is incremental as it builds on existing trends in small model development.

The paper argues that small language models, termed 'mini-giants', offer a cost-effective alternative to large models like ChatGPT, highlighting their growing competence and potential for open-source collaboration in technical, ethical, and social contexts.

ChatGPT is phenomenal. However, it is prohibitively expensive to train and refine such giant models. Fortunately, small language models are flourishing and becoming more and more competent. We call them "mini-giants". We argue that open-source communities like Kaggle and mini-giants will achieve a win-win in many ways: technically, ethically, and socially. In this article, we present a brief yet rich background, discuss how to attain small language models, present a comparative study of small language models with a brief discussion of evaluation methods, discuss the application scenarios where small language models are most needed in the real world, and conclude with discussion and outlook.

