Zatom-1: A Multimodal Flow Foundation Model for 3D Molecules and Materials

arXiv:2602.22251v1
1 citation
Originality: Incremental advance
AI Analysis

This work addresses the need for general-purpose AI in chemistry by enabling cross-domain transfer, though it builds incrementally on existing multimodal and flow matching techniques.

The paper tackles limited representation sharing and transfer in 3D chemical modeling by introducing Zatom-1, a foundation model that unifies generative and predictive learning for both molecules and materials. It matches or outperforms specialized baselines and reduces generative inference time by more than an order of magnitude.

General-purpose 3D chemical modeling encompasses molecules and materials, requiring both generative and predictive capabilities. However, most existing AI approaches are optimized for a single domain (molecules or materials) and a single task (generation or prediction), which limits representation sharing and transfer. We introduce Zatom-1, the first foundation model that unifies generative and predictive learning of 3D molecules and materials. Zatom-1 is a Transformer trained with a multimodal flow matching objective that jointly models discrete atom types and continuous 3D geometries. This approach supports scalable pretraining with predictable gains as model capacity increases, while enabling fast and stable sampling. We use joint generative pretraining as a universal initialization for downstream multi-task prediction of properties, energies, and forces. Empirically, Zatom-1 matches or outperforms specialized baselines on both generative and predictive benchmarks, while reducing the generative inference time by more than an order of magnitude. Our experiments demonstrate positive predictive transfer between chemical domains from joint generative pretraining: modeling materials during pretraining improves molecular property prediction accuracy.
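
To make the idea of a multimodal flow matching objective over discrete atom types and continuous 3D geometries more concrete, here is a minimal sketch of how one training example could be built. It is not the paper's implementation: the abstract does not specify the probability paths, so the linear interpolation path for coordinates, the masking path for atom types, the Gaussian noise prior, the `MASK` token, and all function names below are illustrative assumptions.

```python
import random

MASK = -1  # hypothetical "masked" token id for atom types (assumption)


def interpolate_coords(x0, x1, t):
    """Assumed continuous path: x_t = (1 - t) * x0 + t * x1 for a list of 3D points."""
    return [[(1 - t) * a + t * b for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the linear path: d x_t / dt = x1 - x0 (constant in t)."""
    return [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]


def corrupt_types(types, t, rng):
    """Assumed discrete path: each atom type is kept with probability t, else masked."""
    return [z if rng.random() < t else MASK for z in types]


def joint_training_example(coords, types, rng):
    """Build one noisy input plus its regression and classification targets."""
    t = rng.random()  # time sampled uniformly from [0, 1)
    x0 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in coords]  # Gaussian noise
    return {
        "t": t,
        "x_t": interpolate_coords(x0, coords, t),  # noisy geometry fed to the model
        "z_t": corrupt_types(types, t, rng),       # partially masked atom types
        "v_target": target_velocity(x0, coords),   # regression target (e.g. MSE loss)
        "z_target": list(types),                   # classification target (e.g. cross-entropy)
    }


rng = random.Random(0)
water_coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]  # illustrative geometry
water_types = [8, 1, 1]  # atomic numbers: O, H, H
example = joint_training_example(water_coords, water_types, rng)
```

In a sketch like this, a single network would take `x_t`, `z_t`, and `t` and be trained to predict both `v_target` and `z_target`, so one set of weights handles the two modalities. Sampling would then integrate the predicted velocity from noise toward a structure while progressively unmasking atom types.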
