CL · AI · CV · May 11

Phoenix-VL 1.5 Medium Technical Report

arXiv:2605.10391 · 32.9
Predicted impact: top 72% in CL · last 90 days · Originality: Synthesis-oriented
AI Analysis

This work provides a sovereign AI model for Singapore with strong domain-specific performance, but the approach is incremental, adapting an existing model with localized data.

Phoenix-VL 1.5 Medium is a 123B-parameter multimodal and multilingual foundation model adapted to Singapore and regional languages, achieving state-of-the-art performance on Singapore-specific benchmarks while maintaining global competitiveness on general multimodal and STEM tasks.

We introduce Phoenix-VL 1.5 Medium, a 123B-parameter natively multimodal and multilingual foundation model adapted to regional languages and the Singapore context. Developed as a sovereign AI asset, it demonstrates that deep domain adaptation can be achieved with minimal degradation to broad-spectrum intelligence and alignment. Continued pretraining was performed on Mistral Medium 3.1 using a localized 1-trillion-token multimodal corpus, followed by a 250-billion-token long-context extension phase. Subsequent post-training incorporated a novel human-annotated Singapore multimodal dataset and a curated textual corpus on Singapore culture, knowledge, and legislation, totaling 22 billion tokens. Model alignment was then performed on an additional 5 billion tokens through Online Direct Preference Optimization. Phoenix-VL 1.5 Medium achieves state-of-the-art performance for its size on Singapore multimodal, legal, and government policy benchmarks while remaining globally competitive on general multimodal intelligence, multilingual, and STEM benchmarks. We also introduce a novel evaluation suite encompassing localized knowledge benchmarks and an institutionally aligned model behavior and safety framework. We report the data curation principles and training methodology, and highlight benchmark and inference performance.
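
To put the stated token budgets in proportion, the short sketch below tallies the four training phases named in the abstract and prints each phase's share of the total. The phase names and the `budget_summary` helper are illustrative, not taken from the report, and the sum assumes the four figures are disjoint and additive, which the abstract does not state explicitly.

```python
# Token budgets per training phase, in billions of tokens, as quoted in the abstract.
# The dictionary keys are illustrative labels, not names used by the authors.
PHASES = {
    "continued_pretraining": 1000,   # 1-trillion-token localized multimodal corpus
    "long_context_extension": 250,   # 250-billion-token long-context phase
    "post_training": 22,             # 22 billion tokens of Singapore-specific data
    "online_dpo_alignment": 5,       # 5 billion tokens of Online DPO
}


def budget_summary(phases):
    """Return the total budget and each phase's fraction of it.

    Assumes the phases do not overlap, so their budgets can simply be added.
    """
    total = sum(phases.values())
    shares = {name: tokens / total for name, tokens in phases.items()}
    return total, shares


total, shares = budget_summary(PHASES)
print(f"total: {total}B tokens")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that additive assumption, continued pretraining accounts for roughly 78% of the tokens, while post-training and alignment together come to about 2%.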

