Xinming Zhang

h-index33

10papers

585citations

Novelty52%

AI Score43

Ranked #53,688 of 194,257 authors (top 28%)#12,238 in LG (top 30%)

10 Papers

38.8LGFeb 16, 2023

GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks

Zemin Liu, Xingtong Yu, Yuan Fang et al.

Graphs can model complex relationships between objects, enabling a myriad of Web applications such as online page/article classification and social recommendation. While graph neural networks(GNNs) have emerged as a powerful tool for graph representation learning, in an end-to-end supervised setting, their performance heavily rely on a large amount of task-specific supervision. To reduce labeling requirement, the "pre-train, fine-tune" and "pre-train, prompt" paradigms have become increasingly common. In particular, prompting is a popular alternative to fine-tuning in natural language processing, which is designed to narrow the gap between pre-training and downstream objectives in a task-specific manner. However, existing study of prompting on graphs is still limited, lacking a universal treatment to appeal to different downstream tasks. In this paper, we propose GraphPrompt, a novel pre-training and prompting framework on graphs. GraphPrompt not only unifies pre-training and downstream tasks into a common task template, but also employs a learnable prompt to assist a downstream task in locating the most relevant knowledge from the pre-train model in a task-specific manner. Finally, we conduct extensive experiments on five public datasets to evaluate and analyze GraphPrompt.

8.4CVSep 16, 2023Code

Pixel Adapter: A Graph-Based Post-Processing Approach for Scene Text Image Super-Resolution

Wenyu Zhang, Xin Deng, Baojun Jia et al.

Current Scene text image super-resolution approaches primarily focus on extracting robust features, acquiring text information, and complex training strategies to generate super-resolution images. However, the upsampling module, which is crucial in the process of converting low-resolution images to high-resolution ones, has received little attention in existing works. To address this issue, we propose the Pixel Adapter Module (PAM) based on graph attention to address pixel distortion caused by upsampling. The PAM effectively captures local structural information by allowing each pixel to interact with its neighbors and update features. Unlike previous graph attention mechanisms, our approach achieves 2-3 orders of magnitude improvement in efficiency and memory utilization by eliminating the dependency on sparse adjacency matrices and introducing a sliding window approach for efficient parallel computation. Additionally, we introduce the MLP-based Sequential Residual Block (MSRB) for robust feature extraction from text images, and a Local Contour Awareness loss ($\mathcal{L}_{lca}$) to enhance the model's perception of details. Comprehensive experiments on TextZoom demonstrate that our proposed method generates high-quality super-resolution images, surpassing existing methods in recognition accuracy. For single-stage and multi-stage strategies, we achieved improvements of 0.7\% and 2.6\%, respectively, increasing the performance from 52.6\% and 53.7\% to 53.3\% and 56.3\%. The code is available at https://github.com/wenyu1009/RTSRN.

15.5LGFeb 7, 2023Code

Learning to Count Isomorphisms with Graph Neural Networks

Xingtong Yu, Zemin Liu, Yuan Fang et al.

Subgraph isomorphism counting is an important problem on graphs, as many graph-based tasks exploit recurring subgraph patterns. Classical methods usually boil down to a backtracking framework that needs to navigate a huge search space with prohibitive computational costs. Some recent studies resort to graph neural networks (GNNs) to learn a low-dimensional representation for both the query and input graphs, in order to predict the number of subgraph isomorphisms on the input graph. However, typical GNNs employ a node-centric message passing scheme that receives and aggregates messages on nodes, which is inadequate in complex structure matching for isomorphism counting. Moreover, on an input graph, the space of possible query graphs is enormous, and different parts of the input graph will be triggered to match different queries. Thus, expecting a fixed representation of the input graph to match diversely structured query graphs is unrealistic. In this paper, we propose a novel GNN called Count-GNN for subgraph isomorphism counting, to deal with the above challenges. At the edge level, given that an edge is an atomic unit of encoding graph structures, we propose an edge-centric message passing scheme, where messages on edges are propagated and aggregated based on the edge adjacency to preserve fine-grained structural information. At the graph level, we modulate the input graph representation conditioned on the query, so that the input graph can be adapted to each query individually to improve their matching. Finally, we conduct extensive experiments on a number of benchmark datasets to demonstrate the superior performance of Count-GNN.

11.9CLNov 28, 2023Code

MultiGPrompt for Multi-Task Pre-Training and Prompting on Graphs

Xingtong Yu, Chang Zhou, Yuan Fang et al.

Graphs can inherently model interconnected objects on the Web, thereby facilitating a series of Web applications, such as web analyzing and content recommendation. Recently, Graph Neural Networks (GNNs) have emerged as a mainstream technique for graph representation learning. However, their efficacy within an end-to-end supervised framework is significantly tied to the availabilityof task-specific labels. To mitigate labeling costs and enhance robustness in few-shot settings, pre-training on self-supervised tasks has emerged as a promising method, while prompting has been proposed to further narrow the objective gap between pretext and downstream tasks. Although there has been some initial exploration of prompt-based learning on graphs, they primarily leverage a single pretext task, resulting in a limited subset of general knowledge that could be learned from the pre-training data. Hence, in this paper, we propose MultiGPrompt, a novel multi-task pre-training and prompting framework to exploit multiple pretext tasks for more comprehensive pre-trained knowledge. First, in pre-training, we design a set of pretext tokens to synergize multiple pretext tasks. Second, we propose a dual-prompt mechanism consisting of composed and open prompts to leverage task-specific and global pre-training knowledge, to guide downstream tasks in few-shot settings. Finally, we conduct extensive experiments on six public datasets to evaluate and analyze MultiGPrompt.

25.0LGNov 26, 2023Code

Generalized Graph Prompt: Toward a Unification of Pre-Training and Downstream Tasks on Graphs

Xingtong Yu, Zhenghao Liu, Yuan Fang et al.

Graph neural networks have emerged as a powerful tool for graph representation learning, but their performance heavily relies on abundant task-specific supervision. To reduce labeling requirement, the "pre-train, prompt" paradigms have become increasingly common. However, existing study of prompting on graphs is limited, lacking a universal treatment to appeal to different downstream tasks. In this paper, we propose GraphPrompt, a novel pre-training and prompting framework on graphs. GraphPrompt not only unifies pre-training and downstream tasks into a common task template but also employs a learnable prompt to assist a downstream task in locating the most relevant knowledge from the pre-trained model in a task-specific manner. To further enhance GraphPrompt in these two stages, we extend it into GraphPrompt+ with two major enhancements. First, we generalize several popular graph pre-training tasks beyond simple link prediction to broaden the compatibility with our task template. Second, we propose a more generalized prompt design that incorporates a series of prompt vectors within every layer of the pre-trained graph encoder, in order to capitalize on the hierarchical information across different layers beyond just the readout layer. Finally, we conduct extensive experiments on five public datasets to evaluate and analyze GraphPrompt and GraphPrompt+.

16.0CVMay 21, 2022

PointVector: A Vector Representation In Point Cloud Analysis

Xin Deng, WenYu Zhang, Qing Ding et al.

In point cloud analysis, point-based methods have rapidly developed in recent years. These methods have recently focused on concise MLP structures, such as PointNeXt, which have demonstrated competitiveness with Convolutional and Transformer structures. However, standard MLPs are limited in their ability to extract local features effectively. To address this limitation, we propose a Vector-oriented Point Set Abstraction that can aggregate neighboring features through higher-dimensional vectors. To facilitate network optimization, we construct a transformation from scalar to vector using independent angles based on 3D vector rotations. Finally, we develop a PointVector model that follows the structure of PointNeXt. Our experimental results demonstrate that PointVector achieves state-of-the-art performance $\textbf{72.3\% mIOU}$ on the S3DIS Area 5 and $\textbf{78.4\% mIOU}$ on the S3DIS (6-fold cross-validation) with only $\textbf{58\%}$ model parameters of PointNeXt. We hope our work will help the exploration of concise and effective feature representations. The code will be released soon.

26.3LGDec 4, 2023

HGPROMPT: Bridging Homogeneous and Heterogeneous Graphs for Few-shot Prompt Learning

Xingtong Yu, Yuan Fang, Zemin Liu et al.

Graph neural networks (GNNs) and heterogeneous graph neural networks (HGNNs) are prominent techniques for homogeneous and heterogeneous graph representation learning, yet their performance in an end-to-end supervised framework greatly depends on the availability of task-specific supervision. To reduce the labeling cost, pre-training on self-supervised pretext tasks has become a popular paradigm,but there is often a gap between the pre-trained model and downstream tasks, stemming from the divergence in their objectives. To bridge the gap, prompt learning has risen as a promising direction especially in few-shot settings, without the need to fully fine-tune the pre-trained model. While there has been some early exploration of prompt-based learning on graphs, they primarily deal with homogeneous graphs, ignoring the heterogeneous graphs that are prevalent in downstream applications. In this paper, we propose HGPROMPT, a novel pre-training and prompting framework to unify not only pre-training and downstream tasks but also homogeneous and heterogeneous graphs via a dual-template design. Moreover, we propose dual-prompt in HGPROMPT to assist a downstream task in locating the most relevant prior to bridge the gaps caused by not only feature variations but also heterogeneity differences across tasks. Finally, we thoroughly evaluate and analyze HGPROMPT through extensive experiments on three public datasets.

4.1LGAug 31, 2025

Task-Aware Adaptive Modulation: A Replay-Free and Resource-Efficient Approach For Continual Graph Learning

Jingtao Liu, Xinming Zhang

Continual Graph Learning(CGL)focuses on acquiring new knowledge while retaining previously learned information, essential for real-world graph applications. Current methods grapple with two main issues:1) The Stability-Plasticity Dilemma: Replay-based methods often create an imbalance between the Dilemma, while incurring significant storage costs.2) The Resource-Heavy Pre-training: Leading replay-free methods critically depend on extensively pre-trained backbones, this reliance imposes a substantial resource burden.In this paper, we argue that the key to overcoming these challenges lies not in replaying data or fine-tuning the entire network, but in dynamically modulating the internal computational flow of a frozen backbone. We posit that lightweight, task-specific modules can effectively steer a GNN's reasoning process. Motivated by this insight, we propose Task-Aware Adaptive Modulation(TAAM), a replay-free, resource-efficient approach that charts a new path for navigating the stability-plasticity dilemma. TAAM's core is its Neural Synapse Modulators(NSM), which are trained and then frozen for each task to store expert knowledge. A pivotal prototype-guided strategy governs these modulators: 1) For training, it initializes a new NSM by deep-copying from a similar past modulator to boost knowledge transfer. 2) For inference, it selects the most relevant frozen NSM for each task. These NSMs insert into a frozen GNN backbone to perform fine-grained, node-attentive modulation of its internal flow-different from the static perturbations of prior methods. Extensive experiments show that TAAM comprehensively outperforms state-of-the-art methods across six GCIL benchmark datasets. The code will be released upon acceptance of the paper.

4.6LGOct 22, 2024

Rethinking Soft Actor-Critic in High-Dimensional Action Spaces: The Cost of Ignoring Distribution Shift

Yanjun Chen, Xinming Zhang, Xianghui Wang et al.

Soft Actor-Critic algorithm is widely recognized for its robust performance across a range of deep reinforcement learning tasks, where it leverages the tanh transformation to constrain actions within bounded limits. However, this transformation induces a distribution shift, distorting the original Gaussian action distribution and potentially leading the policy to select suboptimal actions, particularly in high-dimensional action spaces. In this paper, we conduct a comprehensive theoretical and empirical analysis of this distribution shift, deriving the precise probability density function (PDF) for actions following the tanh transformation to clarify the misalignment introduced between the transformed distribution's mode and the intended action output. We substantiate these theoretical insights through extensive experiments on high-dimensional tasks within the HumanoidBench benchmark. Our findings indicate that accounting for this distribution shift substantially enhances SAC's performance, resulting in notable improvements in cumulative rewards, sample efficiency, and reliability across tasks. These results underscore a critical consideration for SAC and similar algorithms: addressing transformation-induced distribution shifts is essential to optimizing policy effectiveness in high-dimensional deep reinforcement learning environments, thereby expanding the robustness and applicability of SAC in complex control tasks.

2.2RONov 2, 2020

NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control

Nan Lin, Yuxuan Li, Yujun Zhu et al.

Traditionally, reinforcement learning methods predict the next action based on the current state. However, in many situations, directly applying actions to control systems or robots is dangerous and may lead to unexpected behaviors because action is rather low-level. In this paper, we propose a novel hierarchical reinforcement learning framework without explicit action. Our meta policy tries to manipulate the next optimal state and actual action is produced by the inverse dynamics model. To stabilize the training process, we integrate adversarial learning and information bottleneck into our framework. Under our framework, widely available state-only demonstrations can be exploited effectively for imitation learning. Also, prior knowledge and constraints can be applied to meta policy. We test our algorithm in simulation tasks and its combination with imitation learning. The experimental results show the reliability and robustness of our algorithms.