LGAI | Apr 29, 2025

Model Connectomes: A Generational Approach to Data-Efficient Language Models

arXiv:2504.21047v1 | h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses data efficiency in language modeling, which is crucial for reducing computational costs and improving performance in resource-limited settings, though it is preliminary and incremental in nature.

The authors address data inefficiency in language models with a generational approach modeled on biological evolution: a model inherits a 'connectome' from an outer evolutionary loop before training on a small corpus of 100M tokens. The resulting model performed as well as or better than two closely matched control models on NLP tasks and on alignment to human behavior and brain data, suggesting that the connectome serves as an efficient prior for low-data learning.

Biological neural networks are shaped both by evolution across generations and by individual learning within an organism's lifetime, whereas standard artificial neural networks undergo a single, large training procedure without inherited constraints. In this preliminary work, we propose a framework that incorporates this crucial generational dimension - an "outer loop" of evolution that shapes the "inner loop" of learning - so that artificial networks better mirror the effects of evolution and individual learning in biological organisms. Focusing on language, we train a model that inherits a "model connectome" from the outer evolution loop before exposing it to a developmental-scale corpus of 100M tokens. Compared with two closely matched control models, we show that the connectome model performs better or on par on natural language processing tasks as well as alignment to human behavior and brain data. These findings suggest that a model connectome serves as an efficient prior for learning in low-data regimes - narrowing the gap between single-generation artificial models and biologically evolved neural networks.

