AMOSL: Adaptive Modality-wise Structure Learning in Multi-view Graph Neural Networks For Enhanced Unified Representation
The work addresses modality discrepancies in MVGNNs to improve graph representation learning, though it appears incremental because it builds on existing MVGNN frameworks.
The paper tackles a limitation of multi-view graph neural networks (MVGNNs): they assume identical local topology across modalities, which hinders modality fusion and representation denoising. It proposes AMoSL, an adaptive modality-wise structure learning method that improves graph classification accuracy on six benchmark datasets.
While Multi-view Graph Neural Networks (MVGNNs) excel at leveraging diverse modalities to learn object representations, existing methods assume identical local topology across modalities, overlooking real-world discrepancies. As a result, MVGNNs struggle with modality fusion and representation denoising. To address these issues, we propose adaptive modality-wise structure learning (AMoSL). AMoSL captures node correspondences between modalities via optimal transport and learns them jointly with the graph embedding. To enable efficient end-to-end training, we employ an efficient solution for the resulting complex bilevel optimization problem. Furthermore, AMoSL adapts to downstream tasks through unsupervised learning on inter-modality distances. The effectiveness of AMoSL is demonstrated by its ability to train more accurate graph classifiers on six benchmark datasets.