CL, AI | Sep 7, 2023

FLM-101B: An Open LLM and How to Train It with $100K Budget

Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun

Tencent, Tsinghua

arXiv:2309.03852v38.929 citations | h-index: 63 | Has Code

Originality: Highly original

AI Analysis

This work addresses the financial and environmental costs of LLM pre-training for the AI research community, offering a more efficient approach.

The paper tackles the high computational cost of training large language models (LLMs) by introducing FLM-101B, a model trained with a progressive growth strategy on a $100K budget, achieving 80% of baseline performance with only 10% of the floating-point operations.

Large language models (LLMs) are considered important approaches towards foundational machine intelligence, achieving remarkable success in Natural Language Processing and multimodal tasks, among others. However, the carbon footprint and financial cost of heavy pre-training computation are non-negligible issues. Progressive training methods, inspired by the neurogenesis process that grows neural structures, have shown potential to accelerate LLM pre-training. However, the algorithms, implementation, and practices for progressively training LLMs beyond 100B parameters remain underexplored. In this paper, we show that our model, namely FLM-101B, trained with our growth strategy under a budget of $100K, reaches 80% of the baselines' performance with only 10% of their floating-point operations. We believe that further studies on progressive training will benefit the community by cutting down costs and promoting green AI. The checkpoint of FLM-101B is released at https://huggingface.co/CofeAI/FLM-101B.
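To give a rough sense of why growing a model during training can save compute, the sketch below compares the estimated training FLOPs of a staged schedule against training at the final size for the same number of tokens. It is only an illustration: the 6 * N * D approximation (FLOPs ≈ 6 × parameters × tokens) is a common rule of thumb and is not taken from this abstract, the `Stage` class and both helper functions are hypothetical names, and the stage sizes and token counts are made-up placeholders, not FLM-101B's actual schedule. It also does not reproduce the paper's 10% figure, which is measured against other baseline models.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    params: float  # model size during this stage (number of parameters)
    tokens: float  # tokens processed during this stage


def train_flops(params: float, tokens: float) -> float:
    """Approximate training FLOPs with the 6 * N * D rule of thumb (an assumption)."""
    return 6.0 * params * tokens


def progressive_flops(stages: list[Stage]) -> float:
    """Sum the estimated FLOPs over all growth stages."""
    return sum(train_flops(s.params, s.tokens) for s in stages)


# Hypothetical growth schedule; these numbers are illustrative placeholders only.
schedule = [
    Stage(params=10e9, tokens=200e9),
    Stage(params=50e9, tokens=50e9),
    Stage(params=100e9, tokens=50e9),
]

total_tokens = sum(s.tokens for s in schedule)
grown = progressive_flops(schedule)
# Reference point: train at the final size from the start on the same tokens.
fixed = train_flops(schedule[-1].params, total_tokens)
ratio = grown / fixed

print(f"progressive / fixed-size FLOPs: {ratio:.2f}")
```

Under these placeholder numbers most tokens are processed while the model is still small, which is where the estimated saving comes from; the estimate ignores any overhead of the growth operation itself.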
