Hierarchical Graph Feature Enhancement with Adaptive Frequency Modulation for Visual Recognition
This work addresses the limited structural awareness of CNNs in visual recognition. It is an incremental improvement that integrates graph-based reasoning into CNNs.
The paper tackles the limited ability of CNNs to model complex topological relationships in images by proposing a hierarchical graph feature enhancement framework, which improves recognition performance across multiple datasets, including CIFAR-100, PASCAL VOC, and VisDrone.
Convolutional neural networks (CNNs) have demonstrated strong performance in visual recognition tasks, but their inherent reliance on regular grid structures limits their capacity to model complex topological relationships and non-local semantics within images. To address this limitation, we propose hierarchical graph feature enhancement (HGFE), a novel framework that integrates graph-based reasoning into CNNs to enhance both structural awareness and feature representation. HGFE builds two complementary levels of graph structure: intra-window graph convolution to capture local spatial dependencies, and inter-window supernode interactions to model global semantic relationships. Moreover, we introduce an adaptive frequency modulation module that dynamically balances low-frequency and high-frequency signal propagation, preserving critical edge and texture information while mitigating over-smoothing. The proposed HGFE module is lightweight, end-to-end trainable, and can be seamlessly integrated into standard CNN backbone networks. Extensive experiments on CIFAR-100 (classification), PASCAL VOC and VisDrone (detection), and CrackSeg and CarParts (segmentation) validate the effectiveness of HGFE in improving structural representation and enhancing overall recognition performance.
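The two graph levels and the frequency balance described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract gives no formulas, so the sketch assumes a 4-neighbour grid graph inside each window and between windows, mean pooling to form supernodes, and a fixed scalar `alpha` in place of the learned, adaptive modulation. All function names (`normalized_adjacency`, `frequency_modulated_propagation`, `hierarchical_enhance`) are hypothetical.

```python
import numpy as np


def grid_adjacency(h, w):
    """4-neighbour adjacency for an h x w grid of nodes (row-major order)."""
    n = h * w
    adj = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:  # right neighbour
                adj[i, i + 1] = adj[i + 1, i] = 1.0
            if r + 1 < h:  # bottom neighbour
                adj[i, i + w] = adj[i + w, i] = 1.0
    return adj


def normalized_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def frequency_modulated_propagation(x, adj, alpha):
    """Blend low-pass and high-pass graph filtering of node features x (n, c).

    alpha = 1 keeps only the smoothed (low-frequency) signal; alpha = 0 keeps
    only the residual (high-frequency) signal. Assumption: alpha is a fixed
    scalar or per-node vector here, not a learned gate.
    """
    a_hat = normalized_adjacency(adj)
    low = a_hat @ x   # neighbourhood smoothing
    high = x - low    # (I - A_hat) x, keeps edges and texture
    alpha = np.asarray(alpha, dtype=x.dtype).reshape(-1, 1)
    return alpha * low + (1.0 - alpha) * high


def hierarchical_enhance(feat, win, alpha=0.7):
    """Two-level graph enhancement of a feature map feat with shape (H, W, C)."""
    H, W, C = feat.shape
    assert H % win == 0 and W % win == 0, "H and W must be divisible by win"
    gh, gw = H // win, W // win

    # Partition into non-overlapping windows: (gh*gw, win*win, C).
    windows = (feat.reshape(gh, win, gw, win, C)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(gh * gw, win * win, C))

    # Level 1: intra-window propagation on a pixel graph.
    local_adj = grid_adjacency(win, win)
    local = np.stack([frequency_modulated_propagation(wd, local_adj, alpha)
                      for wd in windows])

    # Level 2: one supernode per window (mean pooling), propagated across windows.
    supernodes = local.mean(axis=1)
    global_ctx = frequency_modulated_propagation(
        supernodes, grid_adjacency(gh, gw), alpha)

    # Broadcast global context back to every pixel, then undo the partition.
    out = local + global_ctx[:, None, :]
    out = (out.reshape(gh, gw, win, win, C)
              .transpose(0, 2, 1, 3, 4)
              .reshape(H, W, C))
    return feat + out  # residual connection keeps the original CNN features


feat = np.random.default_rng(0).normal(size=(8, 8, 16))
enhanced = hierarchical_enhance(feat, win=4)
print(enhanced.shape)  # (8, 8, 16)
```

Because the low-pass and high-pass branches sum to the input, `alpha = 0.5` simply rescales the features, while values closer to 1 smooth more strongly. A learned, input-dependent gate would replace the fixed `alpha` in a trainable version.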