HKT: A Biologically Inspired Framework for Modular Hereditary Knowledge Transfer in Neural Networks
HKT offers a scalable way to deploy high-performance neural networks in resource-constrained environments, though the contribution is incremental because it builds on existing knowledge transfer methods.
The paper addresses how to improve the performance of small neural networks without sacrificing compactness. It introduces the Hereditary Knowledge Transfer (HKT) framework, a biologically inspired method that selectively transfers features from a larger parent model, and reports consistent gains over standard distillation on vision tasks such as optical flow and image classification.
A prevailing trend in neural network research suggests that model performance improves with increasing depth and capacity, often at the cost of integrability and efficiency. In this paper, we propose a strategy to optimize small, deployable models by enhancing their capabilities through structured knowledge inheritance. We introduce Hereditary Knowledge Transfer (HKT), a biologically inspired framework for modular and selective transfer of task-relevant features from a larger, pretrained parent network to a smaller child model. Unlike standard knowledge distillation, which enforces uniform imitation of teacher outputs, HKT draws inspiration from biological inheritance mechanisms, such as memory RNA transfer in planarians, to guide a multi-stage process of feature transfer. Neural network blocks are treated as functional carriers, and knowledge is transmitted through three biologically motivated components: Extraction, Transfer, and Mixture (ETM). A novel Genetic Attention (GA) mechanism governs the integration of inherited and native representations, ensuring both alignment and selectivity. We evaluate HKT across diverse vision tasks, including optical flow (Sintel, KITTI), image classification (CIFAR-10), and semantic segmentation (LiTS), demonstrating that it significantly improves child model performance while preserving its compactness. The results show that HKT consistently outperforms conventional distillation approaches, offering a general-purpose, interpretable, and scalable solution for deploying high-performance neural networks in resource-constrained environments.
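To make the Extraction, Transfer, and Mixture stages concrete, the following is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not specify how each stage or the Genetic Attention mechanism is computed, so every function here (`extract`, `transfer`, `mix`), the activation-magnitude selection rule, the linear projection, and the sigmoid gate are assumptions chosen only to illustrate the data flow from parent features to a blended child representation.

```python
import math

def extract(parent_feats, k):
    """Extraction (assumed rule): keep the k parent channels with the
    largest mean absolute activation, preserving their original order."""
    scores = [sum(abs(v) for v in ch) / len(ch) for ch in parent_feats]
    top = sorted(range(len(parent_feats)), key=lambda i: scores[i], reverse=True)[:k]
    return [parent_feats[i] for i in sorted(top)]

def transfer(extracted, proj):
    """Transfer (assumed rule): linearly project the k extracted channels
    onto the child's channels. proj has shape [child_channels][k]."""
    n = len(extracted[0])
    return [
        [sum(w * ch[t] for w, ch in zip(row, extracted)) for t in range(n)]
        for row in proj
    ]

def mix(native, inherited):
    """Mixture (a stand-in for Genetic Attention, not the paper's GA):
    a per-channel sigmoid gate, driven by the scaled dot product of native
    and inherited features, blends the two representations."""
    mixed = []
    for nat, inh in zip(native, inherited):
        score = sum(a * b for a, b in zip(nat, inh)) / math.sqrt(len(nat))
        gate = 1.0 / (1.0 + math.exp(-score))
        mixed.append([gate * b + (1.0 - gate) * a for a, b in zip(nat, inh)])
    return mixed

# Toy example: 3 parent channels, 1 child channel, 2 positions per channel.
parent = [[0.1, 0.1], [2.0, -2.0], [1.0, 1.0]]
child_native = [[0.0, 0.0]]
projection = [[1.0, 0.0]]  # hypothetical projection weights

inherited = transfer(extract(parent, k=2), projection)
child_mixed = mix(child_native, inherited)
```

In this sketch the gate lets each child channel decide how much inherited signal to accept, which is one simple way to read the "alignment and selectivity" role the abstract assigns to the attention mechanism. In a trained system the projection and gate would be learned parameters inside the network blocks rather than fixed values.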