Analysis and Explainability of LLMs Via Evolutionary Methods

Shannon K. Gallagher, Swati Rallapalli, Tyler Brooks, Chuck Loughin, Michele Sezgin, Ronald Yurko

arXiv:2605.02930

Predicted impact: top 41% in NE · last 90 days

Originality: Incremental advance

AI Analysis

For researchers studying LLM lineage and interpretability, this work provides a novel framework for analyzing model relationships, though its main validation comes from a controlled experiment, which limits immediate practical impact.

This work adapts evolutionary methods to analyze and explain relationships among large language models (LLMs), demonstrating that estimated evolutionary trees reliably recover ground-truth training tree topology in controlled experiments and identifying important weight layers and dataset contributions.

Evolutionary methods have long been useful for analysis and explanation in genetics, biology, ecology, and related fields. In this work, we extend these methods to neural networks, specifically large language models (LLMs), to better analyze and explain relationships among models. We show how relating weights to genotypes and output text to phenotypes can improve our understanding of model lineage, important datasets, the roles of different model layers, and visualization of model relationships. We demonstrate this in a controlled experiment, where our estimated evolutionary trees reliably recover the topology of the ground-truth training tree. We further identify the most important weight layers according to weight differences and show through phenotypic experiments that one training dataset appears to contribute more useful information than the others. Finally, we generate an unsupervised evolutionary tree of black-box foundation models. Throughout, we provide visualizations that support a clearer understanding of evolutionary relationships among LLMs.
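To make the weights-as-genotypes idea concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not name a distance measure or a tree-building algorithm, so this sketch assumes Euclidean distance between flattened weights and UPGMA (average-linkage clustering) as stand-ins. The model names, layer names, and weight values are invented toy data, and the helper names (`weight_distance`, `layer_differences`, `upgma`) are hypothetical.

```python
import math
from itertools import combinations


def weight_distance(a, b):
    """Euclidean distance between two models over all shared layers (assumed metric)."""
    return math.sqrt(
        sum((x - y) ** 2 for layer in a for x, y in zip(a[layer], b[layer]))
    )


def layer_differences(a, b):
    """Per-layer L2 weight difference between two models, largest first."""
    diffs = {
        layer: math.sqrt(sum((x - y) ** 2 for x, y in zip(a[layer], b[layer])))
        for layer in a
    }
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)


def upgma(models):
    """Build a tree as nested tuples by repeatedly merging the two closest clusters."""
    names = list(models)
    # Cluster id -> (subtree, number of leaves in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pair keys are always (smaller id, larger id).
    dist = {
        (i, j): weight_distance(models[names[i]], models[names[j]])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        merged = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            merged[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(merged)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


# Toy "genotypes": a base model, two fine-tunes of it, and a fine-tune of ft_a.
models = {
    "base":  {"embed": [0.0, 0.0], "head": [0.0, 0.0]},
    "ft_a":  {"embed": [0.1, 0.0], "head": [1.0, 0.0]},
    "ft_a2": {"embed": [0.1, 0.1], "head": [1.2, 0.0]},
    "ft_b":  {"embed": [0.0, 0.1], "head": [0.0, 1.0]},
}

tree = upgma(models)
ranked_layers = layer_differences(models["base"], models["ft_a"])
print(tree)
print(ranked_layers)
```

In this toy setup, `ft_a` and `ft_a2` end up as siblings, and the `head` layer ranks as the one that changed most between `base` and `ft_a`. UPGMA places every model at a leaf, including ancestors such as `base`, so comparing its output to a training tree means comparing topology rather than reading off parent and child roles directly.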
