HELM: Hierarchical and Explicit Label Modeling with Graph Learning for Multi-Label Image Classification
This work addresses two challenges in remote sensing image classification, multi-path label hierarchies and the exploitation of unlabeled data, and represents a strong domain-specific advance.
The paper tackles hierarchical multi-label classification in remote sensing by introducing HELM, a framework that uses hierarchy-specific tokens and graph learning to capture label dependencies and a self-supervised branch to leverage unlabeled data. It reports state-of-the-art performance on four datasets.
Hierarchical multi-label classification (HMLC) is essential for modeling complex label dependencies in remote sensing. Existing methods, however, struggle with multi-path hierarchies where instances belong to multiple branches, and they rarely exploit unlabeled data. We introduce HELM (\textit{Hierarchical and Explicit Label Modeling}), a novel framework that overcomes these limitations. HELM: (i) uses hierarchy-specific class tokens within a Vision Transformer to capture nuanced label interactions; (ii) employs graph convolutional networks to explicitly encode the hierarchical structure and generate hierarchy-aware embeddings; and (iii) integrates a self-supervised branch to effectively leverage unlabeled imagery. We perform a comprehensive evaluation on four remote sensing image (RSI) datasets (UCM, AID, DFC-15, MLRSNet). HELM achieves state-of-the-art performance, consistently outperforming strong baselines in both supervised and semi-supervised settings, demonstrating particular strength in low-label scenarios.
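As a rough illustration of component (ii), the sketch below runs one graph-convolution step over a small label hierarchy. The abstract does not specify HELM's exact GCN formulation, so this assumes the standard symmetric-normalized propagation rule $\mathrm{ReLU}(\hat{D}^{-1/2}(A+I)\hat{D}^{-1/2} H W)$. The label names, edges, and weight matrix are hypothetical and chosen only to make the behavior easy to inspect.

```python
import math


def normalized_adjacency(num_labels, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected label graph."""
    a = [[0.0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        a[i][i] = 1.0  # self-loop so each label keeps its own features
    for parent, child in edges:
        a[parent][child] = 1.0
        a[child][parent] = 1.0
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_labels)]
        for i in range(num_labels)
    ]


def matmul(x, y):
    """Plain dense matrix product on nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical toy hierarchy with two branches (not taken from the paper).
LABELS = ["vegetation", "built-up", "trees", "grass", "buildings"]
EDGES = [(0, 2), (0, 3), (1, 4)]  # (parent, child) index pairs

a_hat = normalized_adjacency(len(LABELS), EDGES)

# One-hot input features: each label starts with its own identity vector.
h0 = [[1.0 if i == j else 0.0 for j in range(len(LABELS))] for i in range(len(LABELS))]

# Hypothetical fixed weights: column 0 sums the vegetation branch,
# column 1 sums the built-up branch. A trained model would learn W.
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

embeddings = gcn_layer(a_hat, h0, w)
for name, vec in zip(LABELS, embeddings):
    print(name, [round(v, 3) for v in vec])
```

After one step, each label's embedding mixes only its own features with those of its parent and children, so labels in different branches stay separated. In a multi-path setting, a single image may carry labels from both branches at once; the graph encodes how the labels relate, not which labels an image has.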