4.5 ML Mar 24
Decorrelation, Diversity, and Emergent Intelligence: The Isomorphism Between Social Insect Colonies and Ensemble Machine Learning
Ernest Fokoué, Gregory Babbitt, Yuval Levental
Social insect colonies and ensemble machine learning methods represent two of the most successful examples of decentralized information processing in nature and computation respectively. Here we develop a rigorous mathematical framework demonstrating that ant colony decision-making and random forest learning are isomorphic under a common formalism of \textbf{stochastic ensemble intelligence}. We show that the mechanisms by which genetically identical ants achieve functional differentiation -- through stochastic response to local cues and positive feedback -- map precisely onto the bootstrap aggregation and random feature subsampling that decorrelate decision trees. Using tools from Bayesian inference, multi-armed bandit theory, and statistical learning theory, we prove that both systems implement identical variance reduction strategies through decorrelation of identical units. We derive explicit mappings between ant recruitment rates and tree weightings, pheromone trail reinforcement and out-of-bag error estimation, and quorum sensing and prediction averaging. This isomorphism suggests that collective intelligence, whether biological or artificial, emerges from a universal principle: \textbf{randomized identical agents + diversity-enforcing mechanisms $\rightarrow$ emergent optimality}.
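To make the "variance reduction through decorrelation" claim concrete, the following is a minimal sketch of the standard textbook result for the variance of an average of identically distributed estimators. It is not taken from the paper: the function name `ensemble_variance` and its parameters are hypothetical, and the sketch assumes every pair of units shares the same correlation `rho`.

```python
def ensemble_variance(rho: float, sigma2: float, n_units: int) -> float:
    """Variance of the mean of n_units identically distributed estimators.

    Assumes each unit has variance sigma2 and every pair of units has the
    same correlation rho (an illustrative simplification).
    """
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    # The first term is a floor that averaging cannot remove;
    # the second term shrinks as more units are added.
    return rho * sigma2 + (1.0 - rho) * sigma2 / n_units


if __name__ == "__main__":
    # Compare strongly and weakly correlated ensembles of growing size.
    for rho in (0.9, 0.5, 0.1):
        values = [round(ensemble_variance(rho, 1.0, b), 3) for b in (1, 10, 100)]
        print(f"rho={rho}: {values}")
```

Under these assumptions, adding units only helps down to the floor `rho * sigma2`, which is why mechanisms that lower the correlation between units (bootstrap resampling and feature subsampling in random forests) matter more than ensemble size alone.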
16.0 NE Apr 3
Isomorphic Functionalities between Ant Colony and Ensemble Learning: Part III -- Gradient Descent, Neural Plasticity, and the Emergence of Deep Intelligence
Ernest Fokoué, Gregory Babbitt, Yuval Levental
In Parts I and II of this series, we established isomorphisms between ant colony decision-making and two major families of ensemble learning: random forests (parallel, variance reduction) and boosting (sequential, bias reduction). Here we complete the trilogy by demonstrating that the fundamental learning algorithm underlying deep neural networks -- stochastic gradient descent -- is mathematically isomorphic to the generational learning dynamics of ant colonies. We prove that pheromone evolution across generations follows the same update equations as weight evolution during gradient descent, with evaporation rates corresponding to learning rates, colony fitness corresponding to negative loss, and recruitment waves corresponding to backpropagation passes. We further show that neural plasticity mechanisms -- long-term potentiation, long-term depression, synaptic pruning, and neurogenesis -- have direct analogs in colony-level adaptation: trail reinforcement, evaporation, abandonment, and new trail formation. Comprehensive simulations confirm that ant colonies trained on environmental tasks exhibit learning curves indistinguishable from neural networks trained on analogous problems. This final isomorphism reveals that all three major paradigms of machine learning -- parallel ensembles, sequential ensembles, and gradient-based deep learning -- have direct analogs in the collective intelligence of social insects, suggesting a unified theory of learning that transcends substrate. The ant colony, we conclude, is not merely analogous to learning algorithms; it is a living embodiment of the fundamental principles of learning itself.
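For orientation, the two update rules the abstract compares can be written side by side in their standard textbook forms. This is a sketch only: the abstract does not give the paper's own equations, and the symbols $\tau$, $\rho$, $\Delta\tau$, $w$, $\eta$, $\lambda$, and $L$ are the conventional ones from the ant colony optimization and gradient descent literature, assumed here for illustration.

```latex
% Standard textbook forms, assumed for illustration; not the paper's equations.
\begin{align}
  \tau_{t+1} &= (1-\rho)\,\tau_t + \Delta\tau_t
    && \text{pheromone update with evaporation rate } \rho \\
  w_{t+1} &= w_t - \eta\,\nabla_w L(w_t)
    && \text{gradient descent step with learning rate } \eta \\
  w_{t+1} &= (1-\eta\lambda)\,w_t - \eta\,\nabla_w L(w_t)
    && \text{the same step with weight decay } \lambda
\end{align}
```

The third line shows one way the two forms can line up: with weight decay, the gradient step has the same decay-plus-deposit shape as the pheromone update. Whether the paper uses this route is not stated in the abstract.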
3.1 ML Mar 25
Isomorphic Functionalities between Ant Colony and Ensemble Learning: Part II-On the Strength of Weak Learnability and the Boosting Paradigm
Ernest Fokoué, Gregory Babbitt, Yuval Levental
In Part I of this series, we established a rigorous mathematical isomorphism between ant colony decision-making and random forest learning, demonstrating that variance reduction through decorrelation is a universal principle shared by biological and computational ensembles. Here we turn to the complementary mechanism: bias reduction through adaptive weighting. Just as boosting algorithms sequentially focus on difficult instances, ant colonies dynamically amplify successful foraging paths through pheromone-mediated recruitment. We prove that these processes are mathematically isomorphic, establishing that the fundamental theorem of weak learnability has a direct analog in colony decision-making. We develop a formal mapping between AdaBoost's adaptive reweighting and ant recruitment dynamics, show that the margin theory of boosting corresponds to the stability of quorum decisions, and demonstrate through comprehensive simulation that ant colonies implementing adaptive recruitment achieve the same bias-reduction benefits as boosting algorithms. This completes a unified theory of ensemble intelligence, revealing that both variance reduction (Part I) and bias reduction (Part II) are manifestations of the same underlying mathematical principles governing collective intelligence in biological and computational systems.
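The adaptive reweighting mentioned above can be illustrated with one round of standard discrete AdaBoost. This is a minimal sketch of the boosting side only; it does not reproduce the paper's mapping to ant recruitment. The function name `adaboost_reweight` is hypothetical, labels and predictions are assumed to be in {-1, +1}, and the input weights are assumed to sum to 1.

```python
import math


def adaboost_reweight(weights, labels, predictions):
    """Run one round of discrete AdaBoost reweighting.

    Assumes labels and predictions take values in {-1, +1} and that the
    weights sum to 1. Returns (alpha, new_weights).
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < err < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")
    # Vote strength of this learner: larger when the error is smaller.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Up-weight misclassified points (y * h = -1), down-weight correct ones.
    raw = [w * math.exp(-alpha * y * h)
           for w, y, h in zip(weights, labels, predictions)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


if __name__ == "__main__":
    alpha, new_w = adaboost_reweight(
        [0.25, 0.25, 0.25, 0.25], [1, 1, -1, -1], [1, 1, -1, 1]
    )
    print(alpha, new_w)
```

A known property of this update is that the misclassified points carry exactly half of the total weight afterwards, which forces the next weak learner to focus on them.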
CV Feb 17
Can Vision-Language Models See Squares? Text-Recognition Mediates Spatial Reasoning Across Three Model Families
Yuval Levental
We present a simple experiment that exposes a fundamental limitation in vision-language models (VLMs): the inability to accurately localize filled cells in binary grids when those cells lack textual identity. We generate fifteen 15x15 grids with varying density (10.7%-41.8% filled cells) and render each as two image types -- text symbols (. and #) and filled squares without gridlines -- then ask three frontier VLMs (Claude Opus, ChatGPT 5.2, and Gemini 3 Thinking) to transcribe them. In the text-symbol condition, Claude and ChatGPT achieve approximately 91% cell accuracy and 84% F1, while Gemini achieves 84% accuracy and 63% F1. In the filled-squares condition, all three models collapse to 60-73% accuracy and 29-39% F1. Critically, all conditions pass through the same visual encoder -- the text symbols are images, not tokenized text. The text-vs-squares F1 gap ranges from 34 to 54 points across models, demonstrating that VLMs behave as if they possess a high-fidelity text-recognition pathway for spatial reasoning that dramatically outperforms their native visual pathway. Each model exhibits a distinct failure mode in the squares condition -- systematic under-counting (Claude), massive over-counting (ChatGPT), and template hallucination (Gemini) -- but all share the same underlying deficit: severely degraded spatial localization for non-textual visual elements.
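The two metrics reported above, cell accuracy and F1, can be computed for a single transcribed grid roughly as follows. This is a hedged sketch, not the paper's evaluation code: the function name `score_grid` is hypothetical, grids are assumed to be lists of equal-length strings using `.` and `#`, and F1 is assumed to treat filled cells as the positive class, which the abstract does not state.

```python
def score_grid(truth, pred, filled="#"):
    """Return (cell_accuracy, f1) for a predicted grid against the truth.

    Assumes both grids are lists of equal-length strings and that filled
    cells are the positive class for F1 (an assumption, not from the paper).
    """
    if len(truth) != len(pred):
        raise ValueError("grids must have the same number of rows")
    tp = fp = fn = tn = 0
    for t_row, p_row in zip(truth, pred):
        if len(t_row) != len(p_row):
            raise ValueError("rows must have the same length")
        for t, p in zip(t_row, p_row):
            t_filled, p_filled = (t == filled), (p == filled)
            if t_filled and p_filled:
                tp += 1
            elif p_filled:
                fp += 1
            elif t_filled:
                fn += 1
            else:
                tn += 1
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("grids must contain at least one cell")
    accuracy = (tp + tn) / total
    denom = 2 * tp + fp + fn
    # Convention chosen here: an all-empty grid matched exactly scores 1.0.
    f1 = 2 * tp / denom if denom else 1.0
    return accuracy, f1


if __name__ == "__main__":
    print(score_grid(["#.", ".#"], ["#.", ".."]))
```

Because most cells in a sparse grid are empty, a model can reach high cell accuracy while missing many filled cells, which is consistent with the abstract reporting F1 well below accuracy in every condition.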