LGApr 14Code
RoleMAG: Learning Neighbor Roles in Multimodal GraphsYilong Zuo, Xunkai Li, Zhihan Zhang et al.
Multimodal attributed graphs (MAGs) combine multimodal node attributes with structured relations. However, existing methods usually perform shared message passing on a single graph and implicitly assume that the same neighbors are equally useful for all modalities. In practice, neighbors that benefit one modality may interfere with another, blurring modality-specific signals under shared propagation. To address this issue, we propose RoleMAG, a multimodal graph framework that learns how different neighbors should participate in propagation. Concretely, RoleMAG distinguishes whether a neighbor should provide shared, complementary, or heterophilous signals, and routes them through separate propagation channels. This enables cross-modal completion from complementary neighbors while keeping heterophilous ones out of shared smoothing. Extensive experiments on three graph-centric MAG benchmarks show that RoleMAG achieves the best results on RedditS and Bili\_Dance, while remaining competitive on Toys. Ablation, robustness, and efficiency analyses further support the effectiveness of the proposed role-aware propagation design. Our code is available at https://anonymous.4open.science/r/RoleMAG-7EE0/
LGFeb 5Code
OpenMAG: A Comprehensive Benchmark for Multimodal-Attributed GraphChenxi Wan, Xunkai Li, Yilong Zuo et al.
Multimodal-Attributed Graph (MAG) learning has achieved remarkable success in modeling complex real-world systems by integrating graph topology with rich attributes from multiple modalities. With the rapid proliferation of novel MAG models capable of handling intricate cross-modal semantics and structural dependencies, establishing a rigorous and unified evaluation standard has become imperative. Although existing benchmarks have facilitated initial progress, they exhibit critical limitations in domain coverage, encoder flexibility, model diversity, and task scope, presenting significant challenges to fair evaluation. To bridge this gap, we present OpenMAG, a comprehensive benchmark that integrates 19 datasets across 6 domains and incorporates 16 encoders to support both static and trainable feature encoding. OpenMAG further implements a standardized library of 24 state-of-the-art models and supports 8 downstream tasks, enabling fair comparisons within a unified framework. Through systematic assessment of necessity, data quality, effectiveness, robustness, and efficiency, we derive 14 fundamental insights into MAG learning to guide future advancements. Our code is available at https://github.com/YUKI-N810/OpenMAG.
LGJan 30
OptiMAG: Structure-Semantic Alignment via Unbalanced Optimal TransportYilong Zuo, Xunkai Li, Zhihan Zhang et al.
Multimodal Attributed Graphs (MAGs) have been widely adopted for modeling complex systems by integrating multi-modal information, such as text and images, on nodes. However, we identify a discrepancy between the implicit semantic structure induced by different modality embeddings and the explicit graph structure. For instance, neighbors in the explicit graph structure may be close in one modality but distant in another. Since existing methods typically perform message passing over the fixed explicit graph structure, they inadvertently aggregate dissimilar features, introducing modality-specific noise and impeding effective node representation learning. To address this, we propose OptiMAG, an Unbalanced Optimal Transport-based regularization framework. OptiMAG employs the Fused Gromov-Wasserstein distance to explicitly guide cross-modal structural consistency within local neighborhoods, effectively mitigating structural-semantic conflicts. Moreover, a KL divergence penalty enables adaptive handling of cross-modal inconsistencies. This framework can be seamlessly integrated into existing multimodal graph models, acting as an effective drop-in regularizer. Experiments demonstrate that OptiMAG consistently outperforms baselines across multiple tasks, ranging from graph-centric tasks (e.g., node classification, link prediction) to multimodal-centric generation tasks (e.g., graph2text, graph2image). The source code will be available upon acceptance.
LGOct 10, 2025
When LLM Agents Meet Graph Optimization: An Automated Data Quality Improvement ApproachZhihan Zhang, Xunkai Li, Yilong Zuo et al.
Text-attributed graphs (TAGs) have become a key form of graph-structured data in modern data management and analytics, combining structural relationships with rich textual semantics for diverse applications. However, the effectiveness of analytical models, particularly graph neural networks (GNNs), is highly sensitive to data quality. Our empirical analysis shows that both conventional and LLM-enhanced GNNs degrade notably under textual, structural, and label imperfections, underscoring TAG quality as a key bottleneck for reliable analytics. Existing studies have explored data-level optimization for TAGs, but most focus on specific degradation types and target a single aspect like structure or label, lacking a systematic and comprehensive perspective on data quality improvement. To address this gap, we propose LAGA (Large Language and Graph Agent), a unified multi-agent framework for comprehensive TAG quality optimization. LAGA formulates graph quality control as a data-centric process, integrating detection, planning, action, and evaluation agents into an automated loop. It holistically enhances textual, structural, and label aspects through coordinated multi-modal optimization. Extensive experiments on 5 datasets and 16 baselines across 9 scenarios demonstrate the effectiveness, robustness and scalability of LAGA, confirming the importance of data-centric quality optimization for reliable TAG analytics.