Frequency and Scale Perspectives of Feature Extraction
This work addresses a fundamental problem for researchers, namely understanding how neural networks extract features, but its contribution is incremental: it builds on existing methods and relies on a manually designed approach.
The paper examines the poorly understood nature of feature extraction in CNNs by analyzing their sensitivity to frequencies and scales. It finds low- and medium-frequency biases along with class-specific frequency preferences, and it proposes a Gaussian derivative-based architecture that achieves accuracy comparable to vanilla networks on various datasets.
Convolutional neural networks (CNNs) have achieved superior performance, but the nature and properties of their feature extraction remain unclear. In this paper, by analyzing the sensitivity of neural networks to frequencies and scales, we find that neural networks not only have low- and medium-frequency biases but also prefer different frequency bands for different classes, and that the scale of objects influences the preferred frequency bands. These observations lead to the hypothesis that neural networks must learn the ability to extract features at various scales and frequencies. To corroborate this hypothesis, we propose a network architecture based on Gaussian derivatives, which extracts features by constructing a scale space and employing partial derivatives as local feature extraction operators to separate high-frequency information. This manually designed method of extracting features at different scales allows our GSSDNets to achieve accuracy comparable to that of vanilla networks on various datasets.
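To make the idea of Gaussian-derivative feature extraction over a scale space more concrete, here is a minimal 1D sketch in plain Python. It is not the GSSDNet architecture, whose layer design is not described here. The function names (`gaussian_kernel_1d`, `convolve_1d`, `scale_space_features`), the choice of scales, and the truncation radius are all assumptions made for illustration. The sketch smooths a signal with Gaussians of increasing width to form a scale space, and applies first and second Gaussian derivatives as local operators that respond to rapid (high-frequency) changes.

```python
import math


def gaussian_kernel_1d(sigma, order=0, truncate=4.0):
    """Sampled 1D Gaussian (order 0) or its first/second derivative.

    Hypothetical helper: kernel radius is truncate * sigma, rounded.
    """
    radius = int(truncate * sigma + 0.5)
    xs = range(-radius, radius + 1)
    g = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in xs]
    norm = sum(g)
    g = [v / norm for v in g]  # smoothing kernel sums to 1
    if order == 0:
        return g
    if order == 1:
        # d/dx G(x) = -(x / sigma^2) * G(x)
        return [-(x / sigma ** 2) * v for x, v in zip(xs, g)]
    if order == 2:
        # d^2/dx^2 G(x) = ((x^2 - sigma^2) / sigma^4) * G(x)
        return [((x * x - sigma ** 2) / sigma ** 4) * v for x, v in zip(xs, g)]
    raise ValueError("order must be 0, 1, or 2")


def convolve_1d(signal, kernel):
    """Same-length convolution; out-of-range samples replicate the edges."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i - (k - radius), 0), n - 1)  # clamp index to the signal
            acc += w * signal[j]
        out.append(acc)
    return out


def scale_space_features(signal, sigmas=(1.0, 2.0, 4.0)):
    """Map each scale to its smoothed signal and derivative responses."""
    return {
        sigma: {
            order: convolve_1d(signal, gaussian_kernel_1d(sigma, order))
            for order in (0, 1, 2)
        }
        for sigma in sigmas
    }


# Illustrative input: a step edge between index 31 and index 32.
step = [0.0] * 32 + [1.0] * 32
features = scale_space_features(step)
```

In this sketch the first-derivative response peaks at the step edge, and the peak weakens as sigma grows, since coarser scales suppress high-frequency detail. A 2D version for images would typically apply the same 1D kernels separably along rows and columns.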