Richard C. Gerum

4 papers

89 citations

Novelty: 57%

AI Score: 27

Ranked #162,406 of 201,326 authors (top 81%); #611 in NE (top 54%)

4 Papers

CV · Sep 6, 2022

Improving the Accuracy and Robustness of CNNs Using a Deep CCA Neural Data Regularizer

Cassidy Pirlot, Richard C. Gerum, Cory Efird et al.

As convolutional neural networks (CNNs) become more accurate at object recognition, their representations become more similar to those of the primate visual system. This finding has inspired us and other researchers to ask if the implication also runs the other way: If CNN representations become more brain-like, does the network become more accurate? Previous attempts to address this question showed very modest gains in accuracy, owing in part to limitations of the regularization method. To overcome these limitations, we developed a new neural data regularizer for CNNs that uses Deep Canonical Correlation Analysis (DCCA) to optimize the resemblance of the CNN's image representations to those of the monkey visual cortex. Using this new neural data regularizer, we see much larger performance gains in both classification accuracy and within-super-class accuracy, as compared to the previous state-of-the-art neural data regularizers. These networks are also more robust to adversarial attacks than their unregularized counterparts. Together, these results confirm that neural data regularization can push CNN performance higher, and introduce a new method that obtains a larger performance boost.
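
The abstract above does not spell out the regularizer, so the following is only a minimal sketch of the linear CCA quantity that DCCA builds on, assuming two feature matrices computed on the same images. The function names `canonical_correlations` and `cca_regularizer` are hypothetical, and the penalty (negative sum of canonical correlations) is an illustrative choice, not the authors' loss.

```python
import numpy as np


def canonical_correlations(x, y):
    """Canonical correlations between two views of the same samples.

    x: (n_samples, dx), y: (n_samples, dy). Both are assumed to have full
    column rank, with n_samples larger than dx and dy.
    """
    # Center each view so that correlations are measured around the mean.
    xc = x - x.mean(axis=0, keepdims=True)
    yc = y - y.mean(axis=0, keepdims=True)
    # Orthonormal bases for the column spaces of the two centered views.
    qx, _ = np.linalg.qr(xc)
    qy, _ = np.linalg.qr(yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against round-off outside [0, 1]


def cca_regularizer(x, y):
    """Hypothetical penalty: more correlated views give a lower value."""
    return -float(canonical_correlations(x, y).sum())
```

In a deep variant, `x` would be the output of a trainable network applied to CNN activations and `y` a transformed version of the neural recordings; that wiring is an assumption here and is not shown.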

LG · Aug 22, 2022

Different Spectral Representations in Optimized Artificial Neural Networks and Brains

Richard C. Gerum, Cassidy Pirlot, Alona Fyshe et al.

Recent studies suggest that artificial neural networks (ANNs) that match the spectral properties of the mammalian visual cortex -- namely, the $\sim 1/n$ eigenspectrum of the covariance matrix of neural activities -- achieve higher object recognition performance and robustness to adversarial attacks than those that do not. To our knowledge, however, no previous work has systematically explored how modifying the ANN's spectral properties affects performance. To fill this gap, we performed a systematic search over spectral regularizers, forcing the ANN's eigenspectrum to follow $1/n^α$ power laws with different exponents $α$. We found that larger powers (around 2--3) lead to better validation accuracy and more robustness to adversarial attacks on dense networks. This surprising finding applied to both shallow and deep networks and it overturns the notion that the brain-like spectrum (corresponding to $α\sim 1$) always optimizes ANN performance and/or robustness. For convolutional networks, the best $α$ values depend on the task complexity and evaluation metric: lower $α$ values optimized validation accuracy and robustness to adversarial attack for networks performing a simple object recognition task (categorizing MNIST images of handwritten digits); for a more complex task (categorizing CIFAR-10 natural images), we found that lower $α$ values optimized validation accuracy whereas higher $α$ values optimized adversarial robustness. These results have two main implications. First, they cast doubt on the notion that brain-like spectral properties ($α\sim 1$) always optimize ANN performance. Second, they demonstrate the potential for fine-tuned spectral regularizers to optimize a chosen design metric, i.e., accuracy and/or robustness.
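
To make the idea of a $1/n^α$ spectral target concrete, here is a minimal sketch, assuming a matrix of layer activations with one row per input. The names `eigenspectrum` and `power_law_penalty`, and the squared log-deviation used as the penalty, are illustrative assumptions rather than the regularizer used in the paper.

```python
import numpy as np


def eigenspectrum(activations):
    """Eigenvalues of the activation covariance matrix, largest first.

    activations: array of shape (n_samples, n_units).
    """
    centered = activations - activations.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (activations.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return np.clip(eigvals, 0.0, None)  # remove tiny negative round-off


def power_law_penalty(eigvals, alpha):
    """Mean squared log-deviation from a 1/n^alpha spectrum (assumed form)."""
    n = np.arange(1, len(eigvals) + 1)
    # Anchor the target curve to the largest eigenvalue.
    target = eigvals[0] / n**alpha
    eps = 1e-12  # keeps the logarithm finite for zero eigenvalues
    return float(np.mean((np.log(eigvals + eps) - np.log(target + eps)) ** 2))
```

A training loop would add a weighted `power_law_penalty(eigenspectrum(acts), alpha)` term to the task loss; doing so with gradients requires an autodiff framework, which this NumPy sketch leaves out.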

NE · Apr 28, 2020

Integration of Leaky-Integrate-and-Fire-Neurons in Deep Learning Architectures

Richard C. Gerum, Achim Schilling

Until now, modern machine learning has mainly been based on fitting high-dimensional functions to enormous data sets, taking advantage of huge hardware resources. We show that biologically inspired neuron models such as Leaky-Integrate-and-Fire (LIF) neurons provide novel and efficient ways of encoding information. They can be integrated into machine learning models and are a potential target for improving machine learning performance. To this end, we derived simple update rules for the LIF units from the differential equations, which are easy to integrate numerically. We apply a novel approach to train the LIF units in a supervised fashion via backpropagation, by assigning a constant value to the derivative of the neuron activation function exclusively for the backpropagation step. This simple mathematical trick helps to distribute the error between the neurons of the pre-connected layer. We apply our method to the IRIS blossoms image data set and show that the training technique can be used to train LIF neurons on image classification tasks. Furthermore, we show how to integrate our method into the Keras (TensorFlow) framework and run it efficiently on GPUs. To generate a deeper understanding of the mechanisms during training, we developed interactive illustrations, which we provide online. With this study, we want to contribute to current efforts to enhance machine intelligence by integrating principles from biology.
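
The two ingredients named in the abstract, a discrete update rule and a constant stand-in derivative for the backward pass, can be sketched as follows. This uses a textbook LIF equation with forward-Euler integration; the parameter values and the names `lif_simulate` and `spike_surrogate_grad` are assumptions for illustration and may differ from the update rules derived in the paper.

```python
import numpy as np


def lif_simulate(input_current, tau=10.0, v_threshold=1.0, v_reset=0.0, dt=1.0):
    """Forward-Euler integration of dV/dt = (-V + I) / tau with reset.

    Returns the membrane potential after each step and a 0/1 spike train.
    """
    v = v_reset
    voltages, spikes = [], []
    for i_t in input_current:
        v = v + dt * (-v + i_t) / tau  # leaky integration step
        fired = v >= v_threshold
        if fired:
            v = v_reset  # reset the membrane potential after a spike
        voltages.append(v)
        spikes.append(1.0 if fired else 0.0)
    return np.array(voltages), np.array(spikes)


def spike_surrogate_grad(v, constant=1.0):
    """Backward-pass stand-in: treat d(spike)/dV as the same constant everywhere."""
    return np.full_like(np.asarray(v, dtype=float), constant)
```

The forward pass keeps the non-differentiable threshold, while the backward pass would substitute `spike_surrogate_grad` for its derivative. With a constant input of 2.0 the unit fires periodically, whereas an input of 0.5 never reaches the threshold.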

NE · Nov 7, 2019

Sparsity through evolutionary pruning prevents neuronal networks from overfitting

Richard C. Gerum, André Erpenbeck, Patrick Krauss et al.

Modern machine learning techniques take advantage of the exponentially rising computing power of new-generation processor units. As a result, the number of parameters trained to solve complex tasks has increased greatly over the last decades. However, in contrast to our brain, these networks still fail to develop general intelligence in the sense of being able to solve several complex tasks with only one network architecture. This could be because the brain is not a randomly initialized neural network that has to be trained by simply investing a lot of computing power, but has some fixed hierarchical structure from birth. To make progress in decoding the structural basis of biological neural networks, we chose a bottom-up approach in which we evolutionarily trained small neural networks to perform a maze task. This simple maze task requires dynamic decision making with delayed rewards. We were able to show that, during the evolutionary optimization, random severance of connections leads to better generalization performance compared to fully connected networks. We conclude that sparsity is a central property of neural networks and should be considered in modern machine learning approaches.
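
As a rough illustration of evolutionary optimization with random severance of connections, the sketch below mutates a weight matrix together with a boolean connection mask and keeps a child whenever it scores at least as well as the current best. The maze task is not reproduced: `fitness` is a caller-supplied placeholder, and the names `mutate` and `evolve`, the mutation rates, and the hill-climbing selection scheme are all assumptions, not the procedure from the paper.

```python
import numpy as np


def mutate(weights, mask, rng, sigma=0.1, p_sever=0.05):
    """Return a child: jittered weights plus randomly severed connections."""
    child_w = weights + sigma * rng.standard_normal(weights.shape)
    sever = rng.random(mask.shape) < p_sever
    child_mask = mask & ~sever  # severed connections are never restored
    return child_w, child_mask


def evolve(fitness, shape, generations=50, children=8, seed=0):
    """Simple hill-climbing evolution over (weights, mask) pairs."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(shape)
    m = np.ones(shape, dtype=bool)  # start fully connected
    best = fitness(w * m)
    for _ in range(generations):
        for _ in range(children):
            cw, cm = mutate(w, m, rng)
            f = fitness(cw * cm)  # score the effective (masked) weights
            if f >= best:
                w, m, best = cw, cm, f
    return w, m, best
```

Calling `evolve(score_fn, (4, 4))` with any function `score_fn` that maps an effective weight matrix to a scalar returns the final weights, the surviving connection mask, and the best score found.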