LG AI RO

Oct 30, 2025

Pelican-VL 1.0: A Foundation Brain Model for Embodied Intelligence

Yi Zhang, Che Liu, Xiancong Ren, Hanchu Ni, Shuai Zhang, Zeyuan Ding, Jiayu Hu, Hanzhe Shan, Zhenwei Niu, Zhaoyang Liu, Shuang Liu, Yue Zhao

arXiv:2511.00108v213.06 citations | h-index: 4 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of embedding powerful intelligence into various embodiments for AI and robotics applications, representing a significant but incremental advance in the field.

The authors tackled the problem of developing embodied intelligence by introducing Pelican-VL 1.0, a family of open-source embodied brain models, which achieved a 20.3% performance uplift from its base model and outperformed 100B-level open-source counterparts by 10.6% on embodied benchmarks.

This report presents Pelican-VL 1.0, a new family of open-source embodied brain models with parameter scales ranging from 7 billion to 72 billion. Our mission is to embed powerful intelligence into various embodiments. Pelican-VL 1.0 is currently the largest-scale open-source embodied multimodal brain model. Its core advantage lies in the deep integration of data power and intelligent adaptive learning mechanisms. Specifically, the metaloop distilled a high-quality dataset from a raw dataset containing 4+ billion tokens. Pelican-VL 1.0 is trained on a large-scale cluster of 1000+ A800 GPUs, consuming 50k+ A800 GPU-hours per checkpoint. This translates to a 20.3% performance uplift over its base model; the model also outperforms 100B-level open-source counterparts by 10.6%, placing it on par with leading proprietary systems on well-known embodied benchmarks. To train Pelican-VL 1.0, we establish a novel framework, DPPO (Deliberate Practice Policy Optimization), inspired by human metacognition. We operationalize it as a metaloop, an RL-Refine-Diagnose-SFT loop that teaches the AI to practice deliberately.
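As a rough sanity check on the compute figures quoted above, the GPU-hour budget can be converted into wall-clock time. The sketch below is only back-of-the-envelope arithmetic: it treats "50k+" and "1000+" as exactly 50,000 GPU-hours and 1,000 GPUs, and it assumes the whole cluster works on a single checkpoint at perfect utilization. Neither assumption is stated in the abstract, and the function name `wall_clock_hours` is hypothetical.

```python
def wall_clock_hours(gpu_hours: float, num_gpus: int) -> float:
    """Wall-clock hours if gpu_hours of work is spread evenly over num_gpus.

    Assumes perfect utilization and no scheduling or communication overhead.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return gpu_hours / num_gpus


# Figures from the abstract, read as exact values (assumption: the text
# only says "50k+" GPU-hours and "1000+" GPUs).
GPU_HOURS_PER_CHECKPOINT = 50_000
CLUSTER_GPUS = 1_000

hours = wall_clock_hours(GPU_HOURS_PER_CHECKPOINT, CLUSTER_GPUS)
print(f"{hours:.1f} wall-clock hours (~{hours / 24:.1f} days) per checkpoint")
# prints "50.0 wall-clock hours (~2.1 days) per checkpoint"
```

Under these assumptions a checkpoint corresponds to roughly two days of cluster time. The real figure depends on the exact GPU count, the exact GPU-hour total, and how much of the cluster a single run occupies.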
