Polarity is all you need to learn and transfer faster
This work addresses the challenge of improving learning efficiency in AI systems, potentially reducing computational costs, but it appears incremental as it builds on known concepts of weight initialization.
The paper tackles the problem that artificial intelligences require far more training data and computation than natural intelligences, and it investigates weight polarity as a design principle that may explain part of the gap. In simulations and image classification tasks, networks whose weight polarities are set a priori learn faster and with less data. The paper also shows situations in which fixing polarities in advance is disadvantageous.
Natural intelligences (NIs) thrive in a dynamic world: they learn quickly, sometimes from only a few samples. In contrast, artificial intelligences (AIs) typically need a prohibitive number of training samples and a prohibitive amount of computational power to learn. What difference in design principles between NIs and AIs could contribute to this discrepancy? Here, we investigate the role of weight polarity: developmental processes initialize NIs with advantageous polarity configurations; as NIs grow and learn, synapse magnitudes update, yet polarities remain largely unchanged. We demonstrate with simulation and image classification tasks that if weight polarities are adequately set a priori, then networks learn with less time and data. We also explicitly illustrate situations in which setting the weight polarities a priori is disadvantageous for networks. Our work illustrates the value of weight polarities from the perspective of statistical and computational efficiency during learning.
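To make the idea of fixed polarities with trainable magnitudes concrete, here is a minimal sketch in Python with NumPy. It is not the paper's procedure, which the abstract does not specify. It assumes a linear model trained by plain gradient descent, with one possible enforcement rule: after each update, any weight whose sign disagrees with a fixed polarity template is reset to a small magnitude carrying the template sign. The names enforce_polarity, train_frozen_polarity, and eps are hypothetical and chosen for illustration only.

```python
import numpy as np

def enforce_polarity(weights, polarity, eps=1e-3):
    """Return weights whose signs match `polarity` (entries +1 or -1).

    Assumed rule (not from the paper): entries whose sign disagrees with
    the template are reset to magnitude `eps` with the template sign;
    all other entries are left unchanged.
    """
    violated = np.sign(weights) != polarity
    return np.where(violated, eps * polarity, weights)

def train_frozen_polarity(X, y, polarity, lr=0.1, steps=200, seed=0):
    """Fit a linear model by gradient descent while keeping weight signs fixed."""
    rng = np.random.default_rng(seed)
    # Random magnitudes, signs taken from the a priori polarity template.
    w = np.abs(rng.normal(scale=0.1, size=polarity.shape)) * polarity
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w = enforce_polarity(w - lr * grad, polarity)
    return w

# Toy regression problem in which the polarity template matches the true signs.
rng = np.random.default_rng(1)
w_true = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(64, 3))
y = X @ w_true
w_fit = train_frozen_polarity(X, y, polarity=np.sign(w_true))
```

In this toy setup the template matches the signs of the true weights, which corresponds to the "adequately set" case in the abstract. Passing a template with a wrong sign would keep the corresponding weight pinned near zero, a simple picture of the disadvantageous case.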