Buzz, Choose, Forget: A Meta-Bandit Framework for Bee-Like Decision Making
This work targets a domain-specific problem: improving simulations of insect behavior in agroecosystems for biologists and ecologists.
The paper tackles the problem of modeling heterogeneous cognitive strategies in honeybees, where existing imitation learning methods fail when expert policies shift or deviate from optimality. The authors introduce an interpretable meta-bandit framework that minimizes predictive loss while identifying effective memory horizons, and they release a novel dataset of 80 tracked bees.
We introduce a sequential reinforcement learning framework for imitation learning designed to model heterogeneous cognitive strategies in pollinators. Focusing on honeybees, our approach leverages trajectory similarity to capture and forecast behavior across individuals that rely on distinct strategies: some exploit numerical cues, others draw on memory, and others are influenced by environmental factors such as weather. Through empirical evaluation, we show that state-of-the-art imitation learning methods often fail in this setting: when expert policies shift across memory windows or deviate from optimality, these models overlook both fast and slow learning behaviors and cannot faithfully reproduce key decision patterns. Moreover, they offer limited interpretability, hindering biological insight. Our contribution addresses these challenges by (i) introducing a model that minimizes predictive loss while identifying the effective memory horizon most consistent with behavioral data, (ii) ensuring full interpretability so that biologists can analyze the underlying decision-making strategies, and (iii) providing a mathematical framework linking bee policy search with bandit formulations under varying exploration-exploitation dynamics. We also release a novel dataset of 80 tracked bees observed under diverse weather conditions. This benchmark facilitates research on pollinator cognition and supports ecological governance by improving simulations of insect behavior in agroecosystems. Our findings shed new light on the learning strategies and memory interplay shaping pollinator decision-making.
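To make the idea of selecting an effective memory horizon by predictive loss more concrete, the following is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a two-flower choice task with binary rewards, a hypothetical base predictor that picks the option with the highest mean reward over the last `horizon` trials, and a simple exponential-weights (Hedge-style) aggregation with full feedback standing in for the paper's meta-bandit layer. The names `predict_with_horizon`, `select_horizon`, `simulate_agent`, and the parameter `eta` are illustrative and do not come from the paper.

```python
import math
import random


def predict_with_horizon(choices, rewards, horizon, n_arms):
    """Predict the next choice as the arm with the highest mean reward
    over the last `horizon` trials (hypothetical base predictor).
    Arms not visited inside the window get a neutral prior mean of 0.5."""
    totals = [0.0] * n_arms
    counts = [0] * n_arms
    for c, r in zip(choices[-horizon:], rewards[-horizon:]):
        totals[c] += r
        counts[c] += 1
    means = [totals[a] / counts[a] if counts[a] else 0.5 for a in range(n_arms)]
    # Ties are broken in favour of the lowest arm index.
    return max(range(n_arms), key=lambda a: means[a])


def select_horizon(choices, rewards, horizons, n_arms=2, eta=0.5):
    """Score each candidate horizon by its cumulative 0/1 predictive loss
    on the observed choices, then turn losses into exponential weights.
    Returns (best_horizon, weights, losses)."""
    losses = {h: 0.0 for h in horizons}
    for t in range(1, len(choices)):
        for h in horizons:
            pred = predict_with_horizon(choices[:t], rewards[:t], h, n_arms)
            losses[h] += 0.0 if pred == choices[t] else 1.0
    # Subtract the minimum loss before exponentiating for numerical stability.
    min_loss = min(losses.values())
    raw = {h: math.exp(-eta * (losses[h] - min_loss)) for h in horizons}
    total = sum(raw.values())
    weights = {h: w / total for h, w in raw.items()}
    best = min(horizons, key=lambda h: losses[h])
    return best, weights, losses


def simulate_agent(true_horizon, n_trials, reward_probs, seed=0):
    """Generate synthetic choices from an agent that follows the base
    predictor with a fixed memory horizon (illustrative data only)."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    choices, rewards = [], []
    for t in range(n_trials):
        if t == 0:
            c = rng.randrange(n_arms)
        else:
            c = predict_with_horizon(choices, rewards, true_horizon, n_arms)
        r = 1.0 if rng.random() < reward_probs[c] else 0.0
        choices.append(c)
        rewards.append(r)
    return choices, rewards


if __name__ == "__main__":
    # Synthetic agent with a memory of 3 trials; real bee data would replace this.
    choices, rewards = simulate_agent(true_horizon=3, n_trials=200,
                                      reward_probs=[0.7, 0.4])
    best, weights, losses = select_horizon(choices, rewards, horizons=[1, 3, 10])
    print("best horizon:", best)
    print("weights:", weights)
    print("losses:", losses)
```

The per-horizon losses and weights are directly inspectable, which is the kind of interpretability the abstract emphasizes: a biologist could read off which memory window best explains an individual's choices. A bandit-feedback variant would update only the horizon actually used for prediction at each step; the full-feedback version is used here only because it is simpler to verify.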