CL | Apr 24, 2024

Nyonic Technical Report

Junfeng Tian, Rui Wang, Cong Li, Yudong Zhou, Jun Liu, Jun Wang

arXiv:2404.15702v1 | 1 citation | h-index: 2 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient and adaptable custom language models, but it appears incremental: it relies on standard techniques and reports competitive rather than state-of-the-art results.

The report describes the development of a language model intended as a base for custom large language models. The resulting Wonton 7B model achieves competitive performance on multilingual and English benchmarks.

This report details the development and key achievements of our latest language model designed for custom large language models. The advancements introduced include a novel Online Data Scheduler that supports flexible training data adjustments and curriculum learning. The model's architecture is fortified with state-of-the-art techniques such as Rotary Positional Embeddings, QK-LayerNorm, and a specially crafted multilingual tokenizer to enhance stability and performance. Moreover, our robust training framework incorporates advanced monitoring and rapid recovery features to ensure optimal efficiency. Our Wonton 7B model has demonstrated competitive performance on a range of multilingual and English benchmarks. Future developments will prioritize narrowing the performance gap with more extensively trained models, thereby enhancing the model's real-world efficacy and adaptability.

GitHub: https://github.com/nyonicai/nyonic-public
