Motivation is Something You Need
This work addresses the high computational cost of training large AI models. Its approach could benefit machine learning researchers and practitioners, though its application to existing architectures is incremental.
The paper tackles the inefficient training of large models by introducing a dual-model framework inspired by affective neuroscience, which alternates between training a smaller base model and a larger motivated model. Empirical results show that this scheme improves the base model and that the motivated model sometimes surpasses its standalone counterpart, at a lower training cost than training the larger model.
This work introduces a novel training paradigm that draws from affective neuroscience. Inspired by the interplay of emotions and cognition in the human brain, and more specifically by the SEEKING motivational state, we design a dual-model framework in which a smaller base model is trained continuously, while a larger motivated model is activated intermittently during predefined "motivation conditions". The framework mimics the emotional state of high curiosity and anticipation of reward, in which broader brain regions are recruited to enhance cognitive performance. Exploiting scalable architectures where larger models extend smaller ones, our method enables shared weight updates and selective expansion of network capacity during noteworthy training steps. Empirical evaluation on image classification demonstrates that the alternating training scheme enhances the base model efficiently and effectively compared to a traditional scheme and that, in some cases, the motivated model also surpasses its standalone counterpart despite seeing less data per epoch. This opens the possibility of simultaneously training two models tailored to different deployment constraints, with competitive or superior performance, while keeping the training cost lower than that of training the larger model.
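The alternating scheme described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear model in place of an image classifier, and it assumes a hypothetical motivation condition (the current base-model loss exceeds the average of recent losses), since the abstract does not specify one. The names `sgd_step`, `is_motivated`, `train`, and `make_data` are invented for illustration. The base model is the first `base_width` entries of a shared weight list, so the motivated model extends it and every update touches the shared weights.

```python
import random


def predict(weights, x, width):
    """Linear prediction using only the first `width` shared weights."""
    return sum(w * xi for w, xi in zip(weights[:width], x[:width]))


def sgd_step(weights, x, y, width, lr):
    """One squared-error SGD step on the first `width` weights, in place.

    Returns the squared error measured before the update.
    """
    err = predict(weights, x, width) - y
    for i in range(width):
        weights[i] -= lr * err * x[i]
    return err * err


def is_motivated(loss, recent_losses):
    """Hypothetical motivation condition (an assumption, not from the paper):
    the current loss exceeds the average of the recent losses."""
    if not recent_losses:
        return False
    return loss > sum(recent_losses) / len(recent_losses)


def train(data, base_width, full_width, lr=0.05, window=5, epochs=10, seed=0):
    """Alternate between base and motivated updates on one shared weight list."""
    rng = random.Random(seed)
    weights = [0.0] * full_width  # base model = first `base_width` entries
    recent = []
    motivated_steps = 0
    total_steps = 0
    for _ in range(epochs):
        order = list(data)  # copy so the caller's list is not reordered
        rng.shuffle(order)
        for x, y in order:
            # Probe with the base model to decide which model takes this step.
            err = predict(weights, x, base_width) - y
            base_loss = err * err
            if is_motivated(base_loss, recent):
                # Motivated step: the larger model updates all weights,
                # including the ones it shares with the base model.
                sgd_step(weights, x, y, full_width, lr)
                motivated_steps += 1
            else:
                # Ordinary step: only the base model's weights are updated.
                sgd_step(weights, x, y, base_width, lr)
            recent = (recent + [base_loss])[-window:]
            total_steps += 1
    return weights, motivated_steps, total_steps


def make_data(n, dim, seed=0):
    """Synthetic noiseless linear regression data, for demonstration only."""
    rng = random.Random(seed)
    true_w = [rng.uniform(-1, 1) for _ in range(dim)]
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        data.append((x, sum(w * xi for w, xi in zip(true_w, x))))
    return data
```

Calling `train(make_data(20, 4), base_width=2, full_width=4)` returns the shared weights together with the number of motivated steps and total steps. In this sketch the motivated model trains on only a subset of the steps, which mirrors the abstract's point that it sees less data per epoch than a standalone larger model would.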