Fengzhi Li

LG
4 papers
77 citations
Novelty 55%
AI Score 50

4 Papers

LG Aug 25, 2024
LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings

Duo Wang, Yuan Zuo, Fengzhi Li et al.

Zero-shot graph machine learning, especially with graph neural networks (GNNs), has garnered significant interest due to the challenge of scarce labeled data. While methods like self-supervised learning and graph prompt learning have been extensively explored, they often rely on fine-tuning with task-specific labels, limiting their effectiveness in zero-shot scenarios. Inspired by the zero-shot capabilities of instruction-fine-tuned large language models (LLMs), we introduce a novel framework named Token Embedding-Aligned Graph Language Model (TEA-GLM) that leverages LLMs as cross-dataset and cross-task zero-shot learners for graph machine learning. Concretely, we pretrain a GNN, aligning its representations with token embeddings of an LLM. We then train a linear projector that transforms the GNN's representations into a fixed number of graph token embeddings without tuning the LLM. A unified instruction is designed for various graph tasks at different levels, such as node classification (node-level) and link prediction (edge-level). These design choices collectively enhance our method's effectiveness in zero-shot learning, setting it apart from existing methods. Experiments show that our graph token embeddings help the LLM predictor achieve state-of-the-art performance on unseen datasets and tasks compared to other methods using LLMs as predictors.
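The projection step described in this abstract (a linear map from one GNN representation to a fixed number of graph token embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dimensions, and the random initialization are assumptions made for illustration, and the weights would be trained in practice while the LLM stays frozen.

```python
import random

def make_projector(gnn_dim, num_tokens, llm_dim, seed=0):
    """Create a hypothetical linear projector (weights are random, not trained)."""
    rng = random.Random(seed)
    rows = num_tokens * llm_dim
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(gnn_dim)] for _ in range(rows)]
    bias = [0.0] * rows
    return weight, bias

def project_to_graph_tokens(node_repr, weight, bias, num_tokens, llm_dim):
    """Map one GNN representation to `num_tokens` vectors of size `llm_dim`."""
    flat = [
        sum(w * x for w, x in zip(row, node_repr)) + b
        for row, b in zip(weight, bias)
    ]
    # Split the flat output into a fixed number of token embeddings.
    return [flat[i * llm_dim:(i + 1) * llm_dim] for i in range(num_tokens)]

# Assumed sizes for illustration only.
GNN_DIM, NUM_TOKENS, LLM_DIM = 4, 3, 8
weight, bias = make_projector(GNN_DIM, NUM_TOKENS, LLM_DIM)
graph_tokens = project_to_graph_tokens([0.5, -1.0, 0.25, 2.0], weight, bias, NUM_TOKENS, LLM_DIM)
```

In such a setup, the resulting vectors would be placed alongside the instruction's token embeddings at the LLM input, so the number of graph tokens stays fixed regardless of graph size.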

75.8 AI May 4 Code
Strategy-Aware Optimization Modeling with Reasoning LLMs

Ruiqing Zhao, Fengzhi Li, Yuan Zuo et al.

Large language models (LLMs) can generate syntactically valid optimization programs, yet often struggle to reliably choose an effective modeling strategy, leading to incorrect formulations and inefficient solver behavior. We propose SAGE, a strategy-aware framework that makes Modeling Strategy explicit in both data construction and post-training. SAGE builds a solver-verified multi-strategy dataset and trains a student model with supervised fine-tuning followed by Segment-Weighted GRPO using a composite reward over format compliance, correctness, and solver efficiency. Across eight benchmarks spanning synthetic and real-world settings, SAGE improves average pass@1 from 72.7 to 80.3 over the strongest open-source baseline. With multiple generations, SAGE discovers more distinct correct formulations and improves component-level diversity at pass@16 by 19-29%. At the largest scale, SAGE produces more compact constraint systems with 14.2% fewer constraints than the baseline, consistent with solver-efficient modeling. Overall, these results show that making Modeling Strategy explicit improves automated optimization modeling. Code is available at https://github.com/rachhhhing/SAGE.
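To make the idea of a composite reward over format compliance, correctness, and solver efficiency concrete, here is a minimal sketch. The abstract does not give the weights or the efficiency measure, so the weighted sum, the default weights, and the time-ratio efficiency term below are all assumptions, not SAGE's actual reward.

```python
def composite_reward(format_ok, is_correct, solve_time, reference_time,
                     weights=(0.1, 0.7, 0.2)):
    """Hypothetical composite reward in [0, 1]; weights are illustrative."""
    w_fmt, w_cor, w_eff = weights
    fmt = 1.0 if format_ok else 0.0
    cor = 1.0 if is_correct else 0.0
    # Assumption: efficiency only counts for correct formulations, and is
    # measured as reference solve time over actual solve time, capped at 1.
    if is_correct and solve_time > 0:
        eff = min(1.0, reference_time / solve_time)
    else:
        eff = 0.0
    return w_fmt * fmt + w_cor * cor + w_eff * eff
```

Gating the efficiency term on correctness is one possible design choice: it keeps a fast but wrong formulation from outscoring a slow but correct one.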

LG Mar 3
Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models

Fengzhi Li, Liang Zhang, Yuan Zuo et al.

Graph-based tasks in the zero-shot setting remain a significant challenge due to data scarcity and the inability of traditional Graph Neural Networks (GNNs) to generalize to unseen domains or label spaces. While recent work has moved toward using Large Language Models (LLMs) as predictors to enhance GNNs, these methods often suffer from cross-modal alignment issues. A recent paradigm (i.e., Graph-R1) overcomes these architectural dependencies by adopting a purely text-based format and LLM-based graph reasoning, showing improved zero-shot generalization. However, it employs a task-agnostic, one-size-fits-all subgraph extraction strategy, which inevitably introduces significant structural noise (irrelevant neighbors and edges) that distorts the LLM's receptive field and leads to suboptimal predictions. To address this limitation, we introduce GraphSSR, a novel framework designed for adaptive subgraph extraction and denoising in zero-shot LLM-based graph reasoning. Specifically, we propose the SSR pipeline, which dynamically tailors subgraph extraction to specific contexts through a "Sample-Select-Reason" process, enabling the model to autonomously filter out task-irrelevant neighbors and overcome the one-size-fits-all issue. To internalize this capability, we develop SSR-SFT, a data synthesis strategy that generates high-quality SSR-style graph reasoning traces for supervised fine-tuning of LLMs. Furthermore, we propose SSR-RL, a two-stage reinforcement learning framework that explicitly regulates the sampling and selection operations within the SSR pipeline. By incorporating Authenticity-Reinforced and Denoising-Reinforced RL, we guide the model to achieve accurate predictions using parsimonious, denoised subgraphs for reasoning.
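The "Sample-Select-Reason" flow can be pictured with a toy sketch. In the paper the LLM itself performs selection and reasoning; here a keyword-overlap filter stands in for the selection step and the reasoning step is reduced to assembling a text prompt. All function names, the toy graph, and the relevance rule are hypothetical.

```python
import random

def sample_neighbors(graph, node, k, seed=0):
    """Sample: draw up to k candidate neighbors of the target node."""
    rng = random.Random(seed)
    neighbors = sorted(graph.get(node, []))
    return neighbors if len(neighbors) <= k else rng.sample(neighbors, k)

def select_relevant(candidates, texts, task_keywords, min_overlap=1):
    """Select: keep candidates whose text shares keywords with the task.
    Stand-in for the LLM-driven selection described in the abstract."""
    kept = []
    for n in candidates:
        words = set(texts[n].lower().split())
        if len(words & task_keywords) >= min_overlap:
            kept.append(n)
    return kept

def build_prompt(node, kept, texts, question):
    """Reason: assemble the denoised subgraph into a text prompt for an LLM."""
    lines = [f"Target node {node}: {texts[node]}"]
    lines += [f"Neighbor {n}: {texts[n]}" for n in kept]
    lines.append(question)
    return "\n".join(lines)

# Toy citation graph, for illustration only.
graph = {"p0": ["p1", "p2", "p3"]}
texts = {
    "p0": "graph neural networks survey",
    "p1": "graph neural networks for molecules",
    "p2": "cooking recipes with pasta",
    "p3": "neural message passing",
}
candidates = sample_neighbors(graph, "p0", k=3)
kept = select_relevant(candidates, texts, {"graph", "neural"})
prompt = build_prompt("p0", kept, texts, "Which research area does the target node belong to?")
```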

40.0 DC Apr 12
CIR: Lightweight Container Image for Cross-Platform Deployment

Fengzhi Li, Xiaohui Peng, Qingru Xu et al.

In modern cloud and heterogeneous distributed infrastructures, container images are widely used as the deployment unit for machine learning applications. An image bundles the application with its entire platform-specific execution environment and can be directly launched into a container instance. However, this approach forces developers to build and maintain separate images for each target deployment platform. This limitation is particularly evident for widely used interpreted languages such as Python and R in data analytics and machine learning, where application code is inherently cross-platform, yet the runtime dependencies are highly platform-specific. With emerging computing paradigms such as sky computing and edge computing, which demand seamless workload migration and cross-platform deployment, traditional images not only introduce inefficiencies in storage and network usage, but also impose substantial burdens on developers, who must repeatedly craft and manage platform-specific builds. To address these challenges, we propose a lazy-build approach that defers platform-specific construction to the deployment stage, thus keeping the image itself cross-platform. To enable this, we introduce a new image format, CIR (Container Intermediate Representation), together with its pre-builder and lazy-builder. CIR targets interpreted-language applications and only stores the identifiers of the application's direct dependencies, leaving platform adaptation to the lazy-builder, which at deployment time assembles the actual dependencies into runnable containers. A single CIR can therefore be deployed across heterogeneous platforms while reducing image size by 95% compared to conventional images that bundle all dependencies. In our evaluation, CIR reduces deployment time by 40-60% compared with pre-built images, outperforming state-of-the-art systems such as Docker, Buildah, and Apptainer.
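The lazy-build idea (store only the identifiers of direct dependencies, then resolve them per platform at deployment time) can be sketched as follows. The `CirImage` structure, the `lazy_build` function, and the in-memory `package_index` are hypothetical; the actual CIR format and its builders are not specified in the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CirImage:
    """Hypothetical cross-platform image: app files plus dependency identifiers."""
    app_files: tuple
    direct_deps: tuple  # identifiers only, no platform-specific binaries

def lazy_build(image, platform, package_index):
    """Resolve each dependency identifier to a platform-specific artifact."""
    resolved = {}
    for dep in image.direct_deps:
        builds = package_index.get(dep, {})
        if platform not in builds:
            raise LookupError(f"no build of {dep} for {platform}")
        resolved[dep] = builds[platform]
    return {"platform": platform, "files": list(image.app_files), "deps": resolved}

# Illustrative index mapping identifiers to per-platform artifacts (made-up names).
package_index = {
    "examplelib==1.0": {
        "linux/amd64": "examplelib-1.0-linux_x86_64.whl",
        "linux/arm64": "examplelib-1.0-linux_aarch64.whl",
    },
}
image = CirImage(app_files=("main.py",), direct_deps=("examplelib==1.0",))
amd64 = lazy_build(image, "linux/amd64", package_index)
arm64 = lazy_build(image, "linux/arm64", package_index)
```

The same `image` object is used for both targets; only the deployment-time resolution differs, which is the property the abstract attributes to CIR.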