CL AI LG
Oct 11, 2023

KwaiYiiMath: Technical Report

arXiv:2310.07488v2
4 citations
h-index: 15
AI Analysis

This work addresses the problem of enhancing mathematical reasoning for users of LLMs, but it is incremental as it builds on existing methods and datasets.

The authors improve mathematical reasoning in large language models by fine-tuning KwaiYiiBase1 with SFT and RLHF on English and Chinese tasks, reporting state-of-the-art performance among similar-size models on GSM8k, CMath, and their new KMath test set.

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even mathematical tasks requiring multi-step reasoning. In this report, we introduce KwaiYiiMath, which enhances the mathematical reasoning abilities of KwaiYiiBase1 by applying Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) on both English and Chinese mathematical tasks. We also constructed a small-scale Chinese primary school mathematics test set (named KMath), consisting of 188 examples, to evaluate the correctness of the problem-solving process generated by the models. Empirical studies demonstrate that KwaiYiiMath achieves state-of-the-art (SOTA) performance on GSM8k, CMath, and KMath compared with models of similar size.

