CL AI LG · Sep 5, 2025

PLaMo 2 Technical Report

Preferred Networks, Kaizaburo Chubachi, Yasuhiro Fujita, Shinichi Hemmi, Yuta Hirokawa, Kentaro Imajo, Toshiki Kataoka, Goro Kobayashi, Kenichi Maehashi, Calvin Metzger, Hiroaki Mikami, Shogo Murai

arXiv:2509.04897v24.91 citations · h-index: 7

Originality: Incremental advance

AI Analysis

This work addresses data scarcity and computational inefficiency in Japanese language processing. It is an incremental advance, since it builds on existing architectures and methods.

The authors develop efficient Japanese-focused large language models by introducing PLaMo 2, which combines a hybrid Samba-based architecture with structured pruning to produce an 8B model whose performance is comparable to that of their previous 100B model. The PLaMo 2 models achieve state-of-the-art results on Japanese benchmarks.

In this report, we introduce PLaMo 2, a series of Japanese-focused large language models featuring a hybrid Samba-based architecture that transitions to full attention via continual pre-training to support 32K token contexts. Training leverages extensive synthetic corpora to overcome data scarcity, while computational efficiency is achieved through weight reuse and structured pruning. This efficient pruning methodology produces an 8B model that achieves performance comparable to our previous 100B model. Post-training further refines the models using a pipeline of supervised fine-tuning (SFT) and direct preference optimization (DPO), enhanced by synthetic Japanese instruction data and model merging techniques. Optimized for inference using vLLM and quantization with minimal accuracy loss, the PLaMo 2 models achieve state-of-the-art results on Japanese benchmarks, outperforming similarly-sized open models in instruction-following, language fluency, and Japanese-specific knowledge.
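
The abstract names direct preference optimization (DPO) as a post-training step without giving details. As a rough illustration only, the sketch below computes the standard DPO loss for a single preference pair. The function names, the `beta` value, and the example log-probabilities are assumptions made for illustration and are not taken from the report, which may configure DPO differently.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed sequence log-probabilities under the trainable policy
    and a frozen reference model. beta=0.1 is an assumed, commonly used value.
    """
    # How much more (in log space) the policy favors each response
    # than the reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)


# Hypothetical log-probabilities, for illustration only.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy equals reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)   # policy prefers chosen more
print(baseline, improved)
```

When the policy matches the reference model, the loss equals log 2. It falls as the policy raises the probability of the preferred response relative to the rejected one, which is the pressure DPO applies during training.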
