LG | Sep 13, 2024 | Code
Molecular Graph Representation Learning via Structural Similarity Information
Chengyu Yao, Hong Huang, Hang Gao et al.
Graph Neural Networks (GNNs) have been widely employed for feature representation learning in molecular graphs. Therefore, it is crucial to enhance the expressiveness of feature representation to ensure the effectiveness of GNNs. However, a significant portion of current research primarily focuses on the structural features within individual molecules, often overlooking the structural similarity between molecules, which is a crucial aspect encapsulating rich information on the relationship between molecular properties and structural characteristics. Thus, these approaches fail to capture the rich semantic information at the molecular structure level. To bridge this gap, we introduce the Molecular Structural Similarity Motif GNN (MSSM-GNN), a novel molecular graph representation learning method that can capture structural similarity information among molecules from a global perspective. In particular, we propose a specially designed graph that leverages graph kernel algorithms to represent the similarity between molecules quantitatively. Subsequently, we employ GNNs to learn feature representations from molecular graphs, aiming to enhance the accuracy of property prediction by incorporating additional molecular representation information. Finally, through a series of experiments conducted on both small-scale and large-scale molecular datasets, we demonstrate that our model consistently outperforms eleven state-of-the-art baselines. The code is available at https://github.com/yaoyao-yaoyao-cell/MSSM-GNN.
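To make the idea of quantifying similarity between molecules with a graph kernel concrete, here is a minimal sketch. It is not the MSSM-GNN implementation: the abstract does not say which kernel is used, so the sketch assumes a simple label-histogram kernel over atoms and bonded atom pairs, normalized to a cosine similarity. The molecule format, the function names, and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy format: a molecule is (atoms, bonds), where atoms is a
# list of element symbols and bonds is a list of (i, j) atom-index pairs.


def features(atoms, bonds):
    """Count atom labels and unordered bonded-pair labels."""
    counts = Counter(("atom", a) for a in atoms)
    for i, j in bonds:
        pair = tuple(sorted((atoms[i], atoms[j])))
        counts[("bond", pair)] += 1
    return counts


def kernel(f, g):
    """Histogram kernel: dot product of two feature-count vectors."""
    return sum(v * g[k] for k, v in f.items())


def similarity(m1, m2):
    """Normalized kernel value in [0, 1] (cosine similarity of the counts)."""
    f, g = features(*m1), features(*m2)
    denom = sqrt(kernel(f, f) * kernel(g, g))
    return kernel(f, g) / denom if denom else 0.0


def similarity_graph(molecules, threshold=0.5):
    """Connect pairs of molecules whose similarity reaches the threshold.

    Returns a dict mapping (name_a, name_b) to the edge weight.
    """
    edges = {}
    for (a, ma), (b, mb) in combinations(molecules.items(), 2):
        s = similarity(ma, mb)
        if s >= threshold:
            edges[(a, b)] = s
    return edges


# Heavy atoms only; hydrogens are omitted in this toy example.
molecules = {
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
    "methanol": (["C", "O"], [(0, 1)]),
    "water": (["O"], []),
}

for pair, weight in similarity_graph(molecules).items():
    print(pair, round(weight, 3))
```

In a setup like this, each molecule becomes a node of a dataset-level graph and the weighted edges carry the inter-molecule similarity, which a GNN could then use alongside the per-molecule features. A practical system would likely use a stronger kernel (for example one based on subtrees or shortest paths) than the label histogram assumed here.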
CV | Jul 17, 2023
Unbiased Image Synthesis via Manifold Guidance in Diffusion Models
Xingzhe Su, Daixi Jia, Fengge Wu et al.
Diffusion Models are a potent class of generative models capable of producing high-quality images. However, they often inadvertently favor certain data attributes, undermining the diversity of generated images. This issue is starkly apparent in skewed datasets like CelebA, where the initial dataset disproportionately favors females over males by 57.9%, and this bias is amplified in the generated data, where female representation outstrips male representation by 148%. In response, we propose a plug-and-play method named Manifold Guidance Sampling, which is also the first unsupervised method to mitigate bias in DDPMs. Leveraging the inherent structure of the data manifold, this method steers the sampling process towards a more uniform distribution, effectively dispersing the clustering of biased data. Without modifying the existing model or requiring additional training, it significantly mitigates data bias and enhances the quality and unbiasedness of the generated images.
LG | May 9, 2025 | Code
Learn to Think: Bootstrapping LLM Reasoning Capability Through Graph Representation Learning
Hang Gao, Chenhao Zhang, Tie Wang et al.
Large Language Models (LLMs) have achieved remarkable success across various domains. However, they still face significant challenges, including high computational costs for training and limitations in solving complex reasoning problems. Although existing methods have extended the reasoning capabilities of LLMs through structured paradigms, these approaches often rely on task-specific prompts and predefined reasoning processes, which constrain their flexibility and generalizability. To address these limitations, we propose a novel framework that leverages graph learning to enable more flexible and adaptive reasoning capabilities for LLMs. Specifically, this approach models the reasoning process of a problem as a graph and employs LLM-based graph learning to guide the adaptive generation of each reasoning step. To further enhance the adaptability of the model, we introduce a Graph Neural Network (GNN) module to perform representation learning on the generated reasoning process, enabling real-time adjustments to both the model and the prompt. Experimental results demonstrate that this method significantly improves reasoning performance across multiple tasks without requiring additional training or task-specific prompt design. The code is available at https://github.com/zch65458525/L2T.
LG | Dec 11, 2024
Bootstrapping Heterogeneous Graph Representation Learning via Large Language Models: A Generalized Approach
Hang Gao, Chenhao Zhang, Fengge Wu et al.
Graph representation learning methods are highly effective in handling complex non-Euclidean data by capturing intricate relationships and features within graph structures. However, traditional methods face challenges when dealing with heterogeneous graphs that contain various types of nodes and edges due to the diverse sources and complex nature of the data. Existing Heterogeneous Graph Neural Networks (HGNNs) have shown promising results but require prior knowledge of node and edge types and unified node feature formats, which limits their applicability. Recent advancements in graph representation learning using Large Language Models (LLMs) offer new solutions by integrating LLMs' data processing capabilities, enabling the alignment of various graph representations. Nevertheless, these methods often overlook heterogeneous graph data and require extensive preprocessing. To address these limitations, we propose a novel method that leverages the strengths of both LLMs and GNNs, allowing for the processing of graph data with any format and type of nodes and edges without the need for type information or special preprocessing. Our method employs an LLM to automatically summarize and classify different data formats and types, aligns node features, and uses a specialized GNN for targeted learning, thus obtaining effective graph representations for downstream tasks. Theoretical analysis and experimental validation demonstrate the effectiveness of our method.
LG | May 13, 2025
LLM Enhancers for GNNs: An Analysis from the Perspective of Causal Mechanism Identification
Hang Gao, Wenxuan Huang, Fengge Wu et al.
The use of large language models (LLMs) as feature enhancers to optimize node representations, which are then used as inputs for graph neural networks (GNNs), has shown significant potential in graph representation learning. However, the fundamental properties of this approach remain underexplored. To address this issue, we conduct a more in-depth analysis based on the interchange intervention method. First, we construct a synthetic graph dataset with controllable causal relationships, enabling precise manipulation of semantic relationships and causal modeling to provide data for analysis. Using this dataset, we conduct interchange interventions to examine the deeper properties of LLM enhancers and GNNs, uncovering their underlying logic and internal mechanisms. Building on the analytical results, we design a plug-and-play optimization module to improve the information transfer between LLM enhancers and GNNs. Experiments across multiple datasets and models validate the proposed module.
CV | May 31, 2023
Manifold Constraint Regularization for Remote Sensing Image Generation
Xingzhe Su, Changwen Zheng, Wenwen Qiang et al.
Generative Adversarial Networks (GANs) have shown notable accomplishments in the remote sensing domain. However, this paper reveals that their performance on remote sensing images falls short when compared to their impressive results with natural images. This study identifies a previously overlooked issue: GANs exhibit a heightened susceptibility to overfitting on remote sensing images. To address this challenge, this paper analyzes the characteristics of remote sensing images and proposes manifold constraint regularization, a novel approach that tackles overfitting of GANs on remote sensing images for the first time. Our method includes a new measure for evaluating the structure of the data manifold. Leveraging this measure, we propose the manifold constraint regularization term, which not only alleviates the overfitting problem, but also promotes alignment between the generated and real data manifolds, leading to enhanced quality in the generated images. The effectiveness and versatility of this method have been corroborated through extensive validation on various remote sensing datasets and GAN models. The proposed method not only enhances the quality of the generated images, reflected in a 3.13% improvement in Fréchet Inception Distance (FID) score, but also boosts the performance of the GANs on downstream tasks, evidenced by a 3.76% increase in classification accuracy.
IV | May 24, 2019
A Research and Strategy of Remote Sensing Image Denoising Algorithms
Ling Li, Junxing Hu, Fengge Wu et al.
Most raw data downloaded from satellites are useless, resulting in wasted transmission. One solution is to process data directly on satellites and transmit only the processed results to the ground. Image processing is the main form of data processing on satellites; in this paper, we focus on image denoising, which is a basic image processing task. There are many high-performance denoising approaches at present; however, most of them rely on advanced computing resources or rich images on the ground. Considering the limited computing resources of satellites and the characteristics of remote sensing images, we study these high-performance ground image denoising approaches and compare them in simulation experiments to analyze whether they are suitable for satellites. According to the analysis results, we propose two feasible image denoising strategies for satellites based on the TianZhi-1 satellite.
CV | May 24, 2019
A Comparison and Strategy of Semantic Segmentation on Remote Sensing Images
Junxing Hu, Ling Li, Yijun Lin et al.
In recent years, with the development of aerospace technology, we use more and more images captured by satellites to obtain information. However, a large number of useless raw images, limited data storage resources, and poor transmission capability on satellites hinder our use of valuable images. Therefore, it is necessary to deploy an on-orbit semantic segmentation model to filter out useless images before data transmission. In this paper, we present a detailed comparison of recent deep learning models. Considering the computing environment of satellites, we compare methods in terms of accuracy, parameters, and resource consumption on the same public dataset. We also analyze the relation between them. Based on experimental results, we further propose a viable on-orbit semantic segmentation strategy. It will be deployed on the TianZhi-2 satellite, which supports deep learning methods and will be launched soon.