A HMAX with LLC for visual recognition
This work addresses the need for simpler and more efficient neural networks in computer vision, though it is incremental as it builds on existing HMAX architectures.
The paper improves the HMAX neural network for visual recognition by replacing L1-minimization sparse coding with locality-constrained linear coding (LLC) and reintroducing an orientation filter bank, reaching a state-of-the-art 79.0% on the Caltech-101 dataset without transfer learning.
Today's high-performance deep artificial neural networks (ANNs) rely heavily on parameter optimization. This optimization is sequential in nature, and even with a powerful GPU, training such networks to solve challenging tasks can take weeks [22]. HMAX [17] has demonstrated that a simple, high-performing network can be obtained without heavy optimization. In this paper, we improve on the existing best HMAX neural network [12] in terms of both structural simplicity and performance. Our design replaces the L1-minimization sparse coding (SC) with locality-constrained linear coding (LLC) [20], which has a lower computational demand. We also reinstate the simple orientation filter bank in the front layer of the network, replacing PCA. Our system outperforms the existing architecture and reaches 79.0% on the challenging Caltech-101 [7] dataset, which is state-of-the-art for ANNs (without transfer learning). Our empirical data indicate that the main contributors to this performance are the introduction of partial signal whitening, a spot detector, and a spatial pyramid matching (SPM) [14] layer.
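To illustrate why LLC is cheaper than L1-minimization sparse coding, the following is a minimal sketch of the approximated LLC encoder described in [20]: each descriptor is reconstructed from its k nearest codebook entries by solving a small regularized linear system in closed form, with the weights constrained to sum to one. The function name `llc_encode`, the array layout, and the values of `k` and `beta` are illustrative assumptions, not the settings used in this work.

```python
import numpy as np

def llc_encode(X, B, k=5, beta=1e-4):
    """Approximated LLC coding (sketch).

    X: (n, d) array of descriptors.
    B: (m, d) codebook.
    Returns an (n, m) array of codes with at most k non-zeros per row,
    each row summing to 1.
    k and beta are assumed values for illustration only.
    """
    n, m = X.shape[0], B.shape[0]
    # Squared Euclidean distance from every descriptor to every codebook entry.
    d2 = (X ** 2).sum(axis=1)[:, None] - 2.0 * X @ B.T + (B ** 2).sum(axis=1)[None, :]
    # Indices of the k nearest codebook entries for each descriptor.
    idx = np.argsort(d2, axis=1)[:, :k]

    codes = np.zeros((n, m))
    ones = np.ones(k)
    for i in range(n):
        z = B[idx[i]] - X[i]                 # local bases shifted so x is the origin
        C = z @ z.T                          # (k, k) local covariance
        # Regularize for numerical stability (small epsilon guards a zero trace).
        C += np.eye(k) * (beta * np.trace(C) + 1e-12)
        w = np.linalg.solve(C, ones)         # closed-form solution, no iterative L1 solver
        codes[i, idx[i]] = w / w.sum()       # enforce the sum-to-one constraint
    return codes
```

Each descriptor costs one nearest-neighbour search plus one k-by-k linear solve, whereas L1 sparse coding requires an iterative optimization per descriptor. In a pipeline like the one described above, codes of this kind would typically be max-pooled over spatial pyramid cells before classification.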