Insect cyborgs: Bio-mimetic feature generators improve machine learning accuracy on limited data
This work addresses machine learning in low-data regimes by offering a new way to improve classifier performance. The contribution is incremental in that it combines an existing biological model with standard ML methods.
The paper addresses limited training data for machine learning classifiers by attaching a bio-mimetic feature generator, based on a model of the insect olfactory network, as a front end to standard classifiers. Relative test set accuracy improved by an average of 6% to 33%, depending on baseline accuracy, and the relative reduction in test set error exceeded 50% for the higher-accuracy baselines.
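To make the two reported quantities concrete, the sketch below computes a relative accuracy gain and a relative error reduction. The formulas are an assumption about how "relative" is meant here (change divided by the baseline value), the function names are hypothetical, and the input accuracies are made-up illustrative numbers, not results from the paper.

```python
def relative_accuracy_gain(acc_base: float, acc_cyborg: float) -> float:
    """Accuracy improvement as a fraction of the baseline accuracy (assumed definition)."""
    return (acc_cyborg - acc_base) / acc_base


def relative_error_reduction(acc_base: float, acc_cyborg: float) -> float:
    """Drop in error rate as a fraction of the baseline error rate (assumed definition)."""
    err_base = 1.0 - acc_base
    err_cyborg = 1.0 - acc_cyborg
    return (err_base - err_cyborg) / err_base


# Hypothetical example: a baseline at 80% accuracy that rises to 90% as a cyborg.
gain = relative_accuracy_gain(0.80, 0.90)
reduction = relative_error_reduction(0.80, 0.90)
print(f"relative accuracy gain:   {gain:.1%}")       # 12.5%
print(f"relative error reduction: {reduction:.1%}")  # 50.0%
```

Under these definitions the same absolute improvement looks small as a relative accuracy gain but large as a relative error reduction when the baseline is already accurate, which is why both figures are worth reporting.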
Machine learning (ML) classifiers always benefit from more informative input features. We seek to auto-generate stronger feature sets in order to address the difficulty that ML methods often experience given limited training data. A wide range of biological neural nets (BNNs) excel at fast learning, implying that they are adept at extracting informative features. We can thus look to BNNs for tools to improve ML performance in this low-data regime. The insect olfactory network learns new odors very rapidly, by means of three key elements: a competitive inhibition layer; a high-dimensional sparse plastic layer; and Hebbian updates of synaptic weights. In this work, we deployed MothNet, a computational model of the insect olfactory network, as an automatic feature generator: attached as a front-end pre-processor, its Readout Neurons provided new features, derived from the original features, for use by standard ML classifiers. We found that these "insect cyborgs", i.e., classifiers that are part insect model and part ML method, performed significantly better than baseline ML methods alone on a vectorized MNIST dataset. The MothNet feature generator also substantially outperformed other feature-generating methods such as PCA, PLS, and NNs, as well as pre-training to initialize NN weights. Cyborgs improved relative test set accuracy by an average of 6% to 33%, depending on baseline ML accuracy, while the relative reduction in test set error exceeded 50% for ML models with higher baseline accuracy. These results indicate the potential value of BNN-inspired feature generators in the ML context.
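The three elements named above, and the idea of appending readout activity to the original features, can be illustrated with a minimal NumPy sketch. This is not MothNet: the class `ToyMothFeatures`, the helpers `competitive_inhibition`, `sparse_code`, and `cyborg_features`, and all parameter values are hypothetical, the Hebbian rule is a simplified label-driven version, and the data are synthetic rather than vectorized MNIST.

```python
import numpy as np


def competitive_inhibition(x, strength=0.5):
    """Suppress each unit by a fraction of the layer's mean activity, then rectify."""
    return np.maximum(x - strength * x.mean(axis=-1, keepdims=True), 0.0)


def sparse_code(h, keep_frac=0.1):
    """Keep only the most active units per sample (top keep_frac), zero the rest."""
    k = max(1, int(round(keep_frac * h.shape[-1])))
    # The k-th largest value per row serves as the firing threshold.
    thresh = np.partition(h, -k, axis=-1)[..., -k][..., None]
    return np.where(h >= thresh, h, 0.0)


class ToyMothFeatures:
    """Toy feature generator: inhibition -> sparse high-dim layer -> Hebbian readouts."""

    def __init__(self, n_in, n_sparse=400, n_readout=10, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, sparse, non-negative random projection into the high-dimensional layer.
        mask = rng.random((n_in, n_sparse)) < 0.1
        self.proj = rng.random((n_in, n_sparse)) * mask
        # Plastic weights from the sparse layer to one readout unit per class.
        self.w = np.zeros((n_sparse, n_readout))
        self.lr = lr

    def _sparse(self, X):
        return sparse_code(competitive_inhibition(X) @ self.proj)

    def fit(self, X, y):
        for s, label in zip(self._sparse(X), y):
            # Simplified Hebbian update: strengthen synapses from active sparse
            # units to the readout unit associated with this sample's class.
            self.w[:, label] += self.lr * s
        return self

    def transform(self, X):
        """Readout activations, used as new features."""
        return self._sparse(X) @ self.w


def cyborg_features(generator, X):
    """Original features with the generator's readout features appended."""
    return np.hstack([X, generator.transform(X)])


# Synthetic low-data example: 3 classes, 5 noisy samples each, 20 input features.
rng = np.random.default_rng(1)
n_classes, n_in = 3, 20
prototypes = rng.random((n_classes, n_in))
y_train = np.repeat(np.arange(n_classes), 5)
X_train = prototypes[y_train] + 0.1 * rng.standard_normal((15, n_in))

gen = ToyMothFeatures(n_in=n_in, n_readout=n_classes).fit(X_train, y_train)
X_cyborg = cyborg_features(gen, X_train)
print(X_cyborg.shape)  # (15, 23): 20 original features plus 3 readout features
```

The augmented matrix `X_cyborg` would then be passed to any standard classifier in place of `X_train`. The sketch shows only how the features are constructed and makes no claim about the accuracy it would yield.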