LG · Feb 2, 2023 · Code
Predicting the Silent Majority on Graphs: Knowledge Transferable Graph Neural Network
Wendong Bi, Bingbing Xu, Xiaoqian Sun et al.
Graphs consisting of vocal nodes ("the vocal minority") and silent nodes ("the silent majority"), namely VS-Graph, are ubiquitous in the real world. Vocal nodes tend to have abundant features and labels. In contrast, silent nodes have only incomplete features and rare labels; e.g., on Twitter's social network, descriptions and political tendencies are abundant for politicians (vocal) but not for ordinary people (silent). Predicting the silent majority remains a crucial yet challenging problem. However, most existing message-passing based GNNs assume that all nodes belong to the same domain and do not consider the missing features and distribution shift between domains, which leaves them poorly suited to VS-Graph. To combat these challenges, we propose the Knowledge Transferable Graph Neural Network (KT-GNN), which models distribution shifts during message passing and representation learning by transferring knowledge from vocal nodes to silent nodes. Specifically, we design a domain-adapted "feature completion and message passing mechanism" for node representation learning that preserves domain difference, followed by a knowledge transferable classifier based on KL-divergence. Comprehensive experiments on real-world scenarios (i.e., company financial risk assessment and political elections) demonstrate the superior performance of our method. Our source code has been open sourced.
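The abstract mentions a classifier based on KL-divergence but does not give its exact form. As a minimal reminder of the quantity involved, the sketch below computes KL(p || q) between two discrete class distributions. The function name and the `eps` guard are illustrative assumptions, not part of KT-GNN.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute 0 by convention; q_i is floored at eps
    (an assumption made here) to avoid division by zero.
    """
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )

# Hypothetical class distributions predicted for a vocal and a silent node.
vocal = [0.7, 0.2, 0.1]
silent = [0.4, 0.4, 0.2]
print(kl_divergence(vocal, silent))
```

KL divergence is asymmetric, so which distribution plays the role of `p` matters when it is used as a training signal.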
SI · Oct 19, 2022
DyTed: Disentangled Representation Learning for Discrete-time Dynamic Graph
Kaike Zhang, Qi Cao, Gaolin Fang et al. · baidu, tencent-ai
Unsupervised representation learning for dynamic graphs has attracted a lot of research attention in recent years. Compared with a static graph, a dynamic graph embodies both the intrinsic stable characteristics of nodes and their time-related dynamic preferences. However, existing methods generally mix these two types of information into a single representation space, which may lead to poor explainability, less robustness, and limited ability when applied to different downstream tasks. To solve these problems, we propose a novel disenTangled representation learning framework for discrete-time Dynamic graphs, namely DyTed. We specially design a temporal-clips contrastive learning task together with a structure contrastive learning task to effectively identify the time-invariant and time-varying representations, respectively. To further enhance the disentanglement of these two types of representation, we propose a disentanglement-aware discriminator under an adversarial learning framework from the perspective of information theory. Extensive experiments on Tencent and five commonly used public datasets demonstrate that DyTed, as a general framework that can be applied to existing methods, achieves state-of-the-art performance on various downstream tasks and is more robust against noise.
CL · Mar 15 · Code
Inference-time Alignment in Continuous Space
Yige Yuan, Teng Xiao, Li Yunfan et al.
Aligning large language models with human feedback at inference time has received increasing attention due to its flexibility. Existing methods rely on generating multiple responses from the base policy for search using a reward model, which can be considered as searching in a discrete response space. However, these methods struggle to explore informative candidates when the base policy is weak or the candidate set is small, resulting in limited effectiveness. In this paper, to address this problem, we propose Simple Energy Adaptation (SEA), a simple yet effective algorithm for inference-time alignment. In contrast to expensive search over the discrete space, SEA directly adapts original responses from the base policy toward the optimal one via gradient-based sampling in continuous latent space. Specifically, SEA formulates inference as an iterative optimization procedure on an energy function over actions in the continuous space defined by the optimal policy, enabling simple and effective alignment. For instance, despite its simplicity, SEA outperforms the second-best baseline with a relative improvement of up to 77.51% on AdvBench and 16.36% on MATH. Our code is publicly available at https://github.com/yuanyige/sea
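To give a feel for gradient-based sampling on an energy function, the sketch below runs Langevin dynamics on a toy one-dimensional quadratic energy. This is only an illustration of the general technique: the energy, the step size, and all function names are assumptions made here and are not taken from the SEA code, which operates on LLM latent representations.

```python
import math
import random

def energy_grad(x, target=2.0):
    # Gradient of the toy energy E(x) = 0.5 * (x - target)^2 (assumed for illustration).
    return x - target

def langevin_sample(x0, steps=500, step_size=0.01, noise_scale=1.0, seed=0):
    """Iterate x <- x - step_size * dE/dx + noise_scale * sqrt(2 * step_size) * N(0, 1)."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step_size * energy_grad(x) + noise_scale * math.sqrt(2.0 * step_size) * noise
    return x

# Start far from the low-energy region and let the dynamics move the sample.
print(langevin_sample(-3.0))
```

Setting `noise_scale=0.0` turns the update into plain gradient descent on the energy, which is a convenient way to check that the drift term points toward the minimum.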
LG · Jan 31, 2023
Company-as-Tribe: Company Financial Risk Assessment on Tribe-Style Graph with Hierarchical Graph Neural Networks
Wendong Bi, Bingbing Xu, Xiaoqian Sun et al.
Company financial risk is ubiquitous, and early risk assessment for listed companies can avoid considerable losses. Traditional methods mainly focus on the financial statements of companies and overlook the complex relationships among them. However, financial statements are often biased and lagged, making it difficult to identify risks accurately and in a timely manner. To address these challenges, we redefine the problem as company financial risk assessment on a tribe-style graph by taking each listed company and its shareholders as a tribe and leveraging financial news to build inter-tribe connections. Such tribe-style graphs present different patterns that distinguish risky companies from normal ones. However, most nodes in the tribe-style graph lack attributes, making it difficult to directly adopt existing graph learning methods (e.g., Graph Neural Networks (GNNs)). In this paper, we propose a novel Hierarchical Graph Neural Network (TH-GNN) for tribe-style graphs with two levels: the first level encodes the structure pattern of the tribes with contrastive learning, and the second level diffuses information based on the inter-tribe relations, achieving effective and efficient risk assessment. Extensive experiments on a real-world company dataset show that our method achieves significant improvements on financial risk assessment over previous competing methods. Extensive ablation studies and visualization further show the effectiveness of our method.
LG · Nov 24, 2023
TEA: Test-time Energy Adaptation
Yige Yuan, Bingbing Xu, Liang Hou et al.
Test-time adaptation (TTA) aims to improve model generalizability when test data diverges from training distribution, offering the distinct advantage of not requiring access to training data and processes, especially valuable in the context of large pre-trained models. However, current TTA methods fail to address the fundamental issue: covariate shift, i.e., the decreased generalizability can be attributed to the model's reliance on the marginal distribution of the training data, which may impair model calibration and introduce confirmation bias. To address this, we propose a novel energy-based perspective, enhancing the model's perception of target data distributions without requiring access to training data or processes. Building on this perspective, we introduce Test-time Energy Adaptation (TEA), which transforms the trained classifier into an energy-based model and aligns the model's distribution with the test data's, enhancing its ability to perceive test distributions and thus improving overall generalizability. Extensive experiments across multiple tasks, benchmarks and architectures demonstrate TEA's superior generalization performance against state-of-the-art methods. Further in-depth analyses reveal that TEA can equip the model with a comprehensive perception of test distribution, ultimately paving the way toward improved generalization and calibration.
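The abstract does not spell out how the classifier is turned into an energy-based model. One widely used reading, assumed here for illustration, defines the energy of an input as the negative log-sum-exp of its logits, so that the usual softmax class probabilities are left unchanged. The function names below are hypothetical.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a non-empty list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def energy_from_logits(logits):
    # Assumed energy: E(x) = -logsumexp(f(x)); lower energy means higher unnormalized density.
    return -logsumexp(logits)

def softmax(logits):
    # The class posterior is the ordinary softmax of the same logits.
    lse = logsumexp(logits)
    return [math.exp(v - lse) for v in logits]

confident = [8.0, 0.5, -1.0]   # hypothetical logits with one dominant class
uncertain = [0.1, 0.0, -0.1]   # hypothetical near-uniform logits
print(energy_from_logits(confident), energy_from_logits(uncertain))
```

Under this reading, inputs whose logits are large receive low energy, which is what makes the marginal density over inputs accessible from a classifier alone.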
LG · Aug 18, 2023
Bridged-GNN: Knowledge Bridge Learning for Effective Knowledge Transfer
Wendong Bi, Xueqi Cheng, Bingbing Xu et al.
The data-hungry problem, characterized by insufficient and low-quality data, poses obstacles for deep learning models. Transfer learning has been a feasible way to transfer knowledge from high-quality external data of source domains to limited data of target domains, following a domain-level knowledge transfer to learn a shared posterior distribution. However, such methods are usually built on strong assumptions, e.g., a domain-invariant posterior distribution, which is usually unsatisfied and may introduce noise, resulting in poor generalization ability on target domains. Inspired by Graph Neural Networks (GNNs) that aggregate information from neighboring nodes, we redefine the paradigm as learning a knowledge-enhanced posterior distribution for target domains, namely Knowledge Bridge Learning (KBL). KBL first learns the scope of knowledge transfer by constructing a Bridged-Graph that connects knowledgeable samples to each target sample and then performs sample-wise knowledge transfer via GNNs. KBL is free from strong assumptions and is robust to noise in the source data. Guided by KBL, we propose Bridged-GNN, which includes an Adaptive Knowledge Retrieval module to build the Bridged-Graph and a Graph Knowledge Transfer module. Comprehensive experiments on both un-relational and relational data-hungry scenarios demonstrate the significant improvements of Bridged-GNN compared with SOTA methods.
LG · Nov 20, 2022
Towards Generalizable Graph Contrastive Learning: An Information Theory Perspective
Yige Yuan, Bingbing Xu, Huawei Shen et al.
Graph contrastive learning (GCL) emerges as the most representative approach for graph representation learning, which leverages the principle of maximizing mutual information (InfoMax) to learn node representations applied in downstream tasks. To explore better generalization from GCL to downstream tasks, previous methods heuristically define data augmentation or pretext tasks. However, the generalization ability of GCL and its theoretical principle are still less reported. In this paper, we first propose a metric named GCL-GE for GCL generalization ability. Considering the intractability of the metric due to the agnostic downstream task, we theoretically prove a mutual information upper bound for it from an information-theoretic perspective. Guided by the bound, we design a GCL framework named InfoAdv with enhanced generalization ability, which jointly optimizes the generalization metric and InfoMax to strike the right balance between pretext task fitting and the generalization ability on downstream tasks. We empirically validate our theoretical findings on a number of representative benchmarks, and experimental results demonstrate that our model achieves state-of-the-art performance.
CL · Mar 15 · Code
Incentivizing Strong Reasoning from Weak Supervision
Yige Yuan, Teng Xiao, Shuchang Tao et al.
Large language models (LLMs) have demonstrated impressive performance on reasoning-intensive tasks, but enhancing their reasoning abilities typically relies on either reinforcement learning (RL) with verifiable signals or supervised fine-tuning (SFT) with high-quality long chain-of-thought (CoT) demonstrations, both of which are expensive. In this paper, we study a novel problem of incentivizing the reasoning capacity of LLMs without expensive high-quality demonstrations and reinforcement learning. We investigate whether the reasoning capabilities of LLMs can be effectively incentivized via supervision from significantly weaker models. We further analyze when and why such weak supervision succeeds in eliciting reasoning abilities in stronger models. Our findings show that supervision from significantly weaker reasoners can substantially improve student reasoning performance, recovering close to 94% of the gains of expensive RL at a fraction of the cost. Experiments across diverse benchmarks and model architectures demonstrate that weak reasoners can effectively incentivize reasoning in stronger student models, consistently improving performance across a wide range of reasoning tasks. Our results suggest that this simple weak-to-strong paradigm is a promising and generalizable alternative to costly methods for incentivizing strong reasoning capabilities at inference-time in LLMs. The code is publicly available at https://github.com/yuanyige/w2sr.
LG · Mar 22, 2022
Twin Weisfeiler-Lehman: High Expressive GNNs for Graph Classification
Zhaohui Wang, Qi Cao, Huawei Shen et al.
The expressive power of message passing GNNs is upper-bounded by the Weisfeiler-Lehman (WL) test. To achieve highly expressive GNNs beyond the WL test, we propose a novel graph isomorphism test method, namely Twin-WL, which simultaneously passes node labels and node identities rather than passing only node labels as WL does. The identity-passing mechanism encodes the complete structure information of the rooted subgraph, and thus Twin-WL offers extra power beyond WL at distinguishing graph structures. Based on Twin-WL, we implement two Twin-GNNs for graph classification by defining a readout function over the rooted subgraph: one simply reads out the size of the rooted subgraph, and the other reads out rich structure information of the subgraph in a GNN style. We prove that both Twin-GNNs have higher expressive power than traditional message passing GNNs. Experiments also demonstrate that the Twin-GNNs significantly outperform state-of-the-art methods on the task of graph classification.
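For context on the baseline that Twin-WL extends, the sketch below implements standard 1-WL color refinement (not Twin-WL itself) and shows a classic pair of non-isomorphic graphs it cannot tell apart. The adjacency-dictionary representation and function name are choices made for this illustration.

```python
from collections import Counter

def wl_histogram(adj, rounds=3):
    """Standard 1-WL: repeatedly combine a node's color with the sorted
    multiset of its neighbours' colors, then return the color histogram.

    adj maps each node to a list of neighbours. Colors are kept as nested
    tuples so that histograms of different graphs are directly comparable.
    """
    colors = {v: () for v in adj}
    for _ in range(rounds):
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())

# Two disjoint triangles versus one 6-cycle: non-isomorphic, both 2-regular.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

print(wl_histogram(two_triangles) == wl_histogram(hexagon))
```

Because every node in both graphs sees the same multiset of neighbour colors in every round, label passing alone never separates them, which is the kind of limitation that passing node identities is meant to address.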
LG · Nov 16, 2022
Hierarchical Estimation for Effective and Efficient Sampling Graph Neural Network
Yang Li, Bingbing Xu, Qi Cao et al.
Improving the scalability of GNNs is critical for large graphs. Existing methods leverage three sampling paradigms, namely node-wise, layer-wise and subgraph sampling, and then design unbiased estimators for scalability. However, high variance still severely hinders GNNs' performance. Because previous studies either lack variance analysis or focus only on a particular sampling paradigm, we first propose a unified node sampling variance analysis framework and analyze the core challenge of "circular dependency" in deriving the minimum variance sampler, i.e., the sampling probability depends on node embeddings while node embeddings cannot be calculated until sampling is finished. Existing studies either ignore the node embeddings or introduce external parameters, so an efficient and effective variance reduction method is still lacking. Therefore, we propose the Hierarchical Estimation based Sampling GNN (HE-SGNN), with the first level estimating the node embeddings in the sampling probability to break the circular dependency, and the second level employing a sampling GNN operator to estimate the nodes' representations on the entire graph. Considering the technical differences, we propose different first-level estimators, i.e., a time series simulation for layer-wise sampling and a feature based simulation for subgraph sampling. The experimental results on seven representative datasets demonstrate the effectiveness and efficiency of our method.
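The circular dependency can be seen in a scalar toy version of neighbour aggregation. The sketch below is not the HE-SGNN estimator; it is a generic importance-sampling estimator of a sum, with hypothetical names, showing that any valid sampling distribution is unbiased but that the variance vanishes only when the probabilities are proportional to the very values being estimated.

```python
import random

def sampled_sum(values, probs, num_samples, rng):
    """Estimate sum(values) by drawing indices with probability probs[i]
    and averaging values[i] / probs[i] (importance weighting)."""
    idx = rng.choices(range(len(values)), weights=probs, k=num_samples)
    return sum(values[i] / probs[i] for i in idx) / num_samples

def expected_estimate(values, probs):
    # Exact expectation of the one-sample estimator: sum_i p_i * (v_i / p_i).
    return sum(p * (v / p) for v, p in zip(values, probs))

values = [1.0, 2.0, 3.0, 4.0]          # stand-ins for (scalar) neighbour embeddings
uniform = [0.25, 0.25, 0.25, 0.25]     # ignores the values: unbiased but noisy
ideal = [0.1, 0.2, 0.3, 0.4]           # proportional to the values: zero variance

rng = random.Random(0)
print(sampled_sum(values, uniform, 5, rng), sampled_sum(values, ideal, 5, rng))
```

The `ideal` probabilities require knowing `values` before sampling, which mirrors the dependency of the sampling probability on node embeddings described in the abstract.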
LG · Apr 24
Chain-of-Memory: Lightweight Memory Construction with Dynamic Evolution for LLM Agents
Xiucheng Xu, Bingbing Xu, Xueyun Tian et al.
External memory systems are pivotal for enabling Large Language Model (LLM) agents to maintain persistent knowledge and perform long-horizon decision-making. Existing paradigms typically follow a two-stage process: computationally expensive memory construction (e.g., structuring data into graphs) followed by naive retrieval-augmented generation. However, our empirical analysis reveals two fundamental limitations: complex construction incurs high costs with marginal performance gains, and simple context concatenation fails to bridge the gap between retrieval recall and reasoning accuracy. To address these challenges, we propose CoM (Chain-of-Memory), a novel framework that advocates for a paradigm shift toward lightweight construction paired with sophisticated utilization. CoM introduces a Chain-of-Memory mechanism that organizes retrieved fragments into coherent inference paths through dynamic evolution, utilizing adaptive truncation to prune irrelevant noise. Extensive experiments on the LongMemEval and LoCoMo benchmarks demonstrate that CoM outperforms strong baselines with accuracy gains of 7.5%-10.4%, while drastically reducing computational overhead to approximately 2.7% of token consumption and 6.0% of latency compared to complex memory architectures.
CL · May 23
Know You Before You Speak: User-State Modeling for LLM Personalization in Multi-Turn Conversation
Jiani Luo, Xiaoyan Zhao, Yang Zhang et al.
Personalized dialogue requires more than recalling explicit user histories: systems also need to infer hidden user states that evolve through interaction and shape appropriate response strategies. Existing memory- and profile-based methods primarily reuse observable user information, offering limited support for modeling user-state dynamics or selecting actions based on how they shape future user states. We propose PUMA (Prospective User-state Modeling for Action selection), a framework grounded in the Free Energy Principle (FEP) that formulates personalization as decision-making under partial observability, centered on an explicit user state model that captures latent user states and their action-conditioned dynamics. At each turn, PUMA maintains a belief over the user's hidden state, refines the user state model for observation generation and action-conditioned state transition, and selects dialogue actions by minimizing expected free energy, balancing epistemic and pragmatic objectives under a unified criterion. This formulation shifts personalization from passive memory retrieval to model-based decision-making over user evolution. We instantiate PUMA on healthcare-oriented counseling and motivational interviewing benchmarks with latent state annotations for rigorous evaluation. Experiments show that PUMA improves long-horizon dialogue outcomes while maintaining strong response quality, and a cross-dataset study demonstrates more reliable user-state estimation and next-state prediction.
CL · Jan 8
Learning from Mistakes: Negative Reasoning Samples Enhance Out-of-Domain Generalization
Xueyun Tian, Minghua Ma, Bingbing Xu et al.
Supervised fine-tuning (SFT) on chain-of-thought (CoT) trajectory demonstrations is a common approach for enabling reasoning in large language models. Standard practice typically retains only trajectories with correct final answers (positives) while ignoring the rest (negatives). We argue that this paradigm discards substantial supervision and exacerbates overfitting, limiting out-of-domain (OOD) generalization. Specifically, we surprisingly find that incorporating negative trajectories into SFT yields substantial OOD generalization gains over positive-only training, as these trajectories often retain valid intermediate reasoning despite incorrect final answers. To understand this effect in depth, we systematically analyze data, training dynamics, and inference behavior, identifying 22 recurring patterns in negative chains that serve a dual role: they moderate loss descent to mitigate overfitting during training and boost policy entropy by 35.67% during inference to facilitate exploration. Motivated by these observations, we further propose Gain-based LOss Weighting (GLOW), an adaptive, sample-aware scheme that exploits such distinctive training dynamics by rescaling per-sample loss based on inter-epoch progress. Empirically, GLOW efficiently leverages unfiltered trajectories, yielding a 5.51% OOD gain over positive-only SFT on Qwen2.5-7B and boosting MMLU from 72.82% to 76.47% as an RL initialization.
AI · Jan 16
Do We Always Need Query-Level Workflows? Rethinking Agentic Workflow Generation for Multi-Agent Systems
Zixu Wang, Bingbing Xu, Yige Yuan et al.
Multi-Agent Systems (MAS) built on large language models typically solve complex tasks by coordinating multiple agents through workflows. Existing approaches generate workflows either at the task level or at the query level, but their relative costs and benefits remain unclear. Through empirical analyses, we show that query-level workflow generation is not always necessary, since a small set of top-K best task-level workflows together already covers an equivalent or even larger set of queries. We further find that exhaustive execution-based task-level evaluation is both extremely token-costly and frequently unreliable. Inspired by the ideas of self-evolution and generative reward modeling, we propose a low-cost task-level generation framework, SCALE, which stands for Self prediction of the optimizer with few-shot CALibration for Evaluation instead of full validation execution. Extensive experiments demonstrate that SCALE maintains competitive performance, with an average degradation of just 0.61% compared to existing approaches across multiple datasets, while cutting overall token usage by up to 83%.
NI · Mar 19
iSatCR: Graph-Empowered Joint Onboard Computing and Routing for LEO Data Delivery
Jiangtao Luo, Bingbing Xu, Shaohua Xia et al.
Sending massive Earth observation data produced by low Earth orbit (LEO) satellites back to the ground for processing consumes a large amount of on-orbit bandwidth and exacerbates the space-to-ground link bottleneck. Most prior work has concentrated on optimizing the routing of raw data within the constellation, yet cannot cope with the surge in data volume. Recently, advances in onboard computing have made it possible to process data in situ, thus significantly reducing the data volume to be transmitted. In this paper, we present iSatCR, a distributed graph-based approach that jointly optimizes onboard computing and routing to boost transmission efficiency. Within iSatCR, we design a novel graph embedding utilizing shifted feature aggregation and distributed message passing to capture satellite states, and then propose a distributed graph-based deep reinforcement learning algorithm that derives joint computing-routing strategies under constrained on-board storage to handle the complexity and dynamics of LEO networks. Extensive experiments show iSatCR outperforms baselines, particularly under high load.
LG · Feb 1, 2024 · Code
Graph Domain Adaptation: Challenges, Progress and Prospects
Boshen Shi, Yongqing Wang, Fangda Guo et al.
As graph representation learning often suffers from label scarcity problems in real-world applications, researchers have proposed graph domain adaptation (GDA) as an effective knowledge-transfer paradigm across graphs. In particular, to enhance model performance on target graphs with specific tasks, GDA introduces a bunch of task-related graphs as source graphs and adapts the knowledge learnt from source graphs to the target graphs. Since GDA combines the advantages of graph representation learning and domain adaptation, it has become a promising direction of transfer learning on graphs and has attracted an increasing amount of research interest in recent years. In this paper, we comprehensively overview the studies of GDA and present a detailed survey of recent advances. Specifically, we outline the research status and challenges, propose a taxonomy, introduce the details of representative works, and discuss the prospects. To the best of our knowledge, this paper is the first survey for graph domain adaptation. A detailed paper list is available at https://github.com/Skyorca/Awesome-Graph-Domain-Adaptation-Papers.
AI · Jan 12
Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents
Yunfan Li, Bingbing Xu, Xueyun Tian et al.
Recent advances in large language models (LLMs) have enabled agents to autonomously execute complex, long-horizon tasks, yet planning remains a primary bottleneck for reliable task execution. Existing methods typically fall into two paradigms: step-wise planning, which is reactive but often short-sighted; and one-shot planning, which generates a complete plan upfront yet is brittle to execution errors. Crucially, both paradigms suffer from entangled contexts, where the agent must reason over a monolithic history spanning multiple sub-tasks. This entanglement increases cognitive load and lets local errors propagate across otherwise independent decisions, making recovery computationally expensive. To address this, we propose Task-Decoupled Planning (TDP), a training-free framework that replaces entangled reasoning with task decoupling. TDP decomposes tasks into a directed acyclic graph (DAG) of sub-goals via a Supervisor. Using a Planner and Executor with scoped contexts, TDP confines reasoning and replanning to the active sub-task. This isolation prevents error propagation and corrects deviations locally without disrupting the workflow. Results on TravelPlanner, ScienceWorld, and HotpotQA show that TDP outperforms strong baselines while reducing token consumption by up to 82%, demonstrating that sub-task decoupling improves both robustness and efficiency for long-horizon agents.
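The idea of a sub-goal DAG with scoped contexts can be sketched with Python's standard `graphlib` module. This is only an illustration under assumed names: the sub-goals, the `execution_order` and `scoped_context` helpers, and the travel example are hypothetical and do not reproduce TDP's Supervisor, Planner, or Executor.

```python
from graphlib import TopologicalSorter

def execution_order(dependencies):
    """Return sub-goals in an order that respects their prerequisites.

    dependencies maps each sub-goal to the set of sub-goals it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())

def scoped_context(dependencies, results, task):
    # A sub-task only sees the finished results of its own prerequisites,
    # not the full execution history.
    return {d: results[d] for d in dependencies.get(task, ()) if d in results}

# Hypothetical travel-planning decomposition.
deps = {
    "book_flight": {"pick_city"},
    "book_hotel": {"pick_city"},
    "build_itinerary": {"book_flight", "book_hotel"},
}

order = execution_order(deps)
results = {"pick_city": "Lisbon", "book_flight": "TP123"}
print(order)
print(scoped_context(deps, results, "book_hotel"))
```

Restricting each sub-task to its prerequisites' results is what keeps a failure in `book_flight` from entering the context used for `book_hotel`.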
AI · Apr 9
Towards Knowledgeable Deep Research: Framework and Benchmark
Wenxuan Liu, Zixuan Li, Bai Long et al.
Deep Research (DR) requires LLM agents to autonomously perform multi-step information seeking, processing, and reasoning to generate comprehensive reports. In contrast to existing studies that mainly focus on unstructured web content, a more challenging DR task should additionally utilize structured knowledge to provide a solid data foundation, facilitate quantitative computation, and lead to in-depth analyses. In this paper, we refer to this novel task as Knowledgeable Deep Research (KDR), which requires DR agents to generate reports with both structured and unstructured knowledge. Furthermore, we propose the Hybrid Knowledge Analysis framework (HKA), a multi-agent architecture that reasons over both kinds of knowledge and integrates the texts, figures, and tables into coherent multimodal reports. The key design is the Structured Knowledge Analyzer, which utilizes both coding and vision-language models to produce figures, tables, and corresponding insights. To support systematic evaluation, we construct KDR-Bench, which covers 9 domains, includes 41 expert-level questions, and incorporates a large number of structured knowledge resources (e.g., 1,252 tables). We further annotate the main conclusions and key points for each question and propose three categories of evaluation metrics including general-purpose, knowledge-centric, and vision-enhanced ones. Experimental results demonstrate that HKA consistently outperforms most existing DR agents on general-purpose and knowledge-centric metrics, and even surpasses the Gemini DR agent on vision-enhanced metrics, highlighting its effectiveness in deep, structure-aware knowledge analysis. Finally, we hope this work can serve as a new foundation for structured knowledge analysis in DR agents and facilitate future multimodal DR studies.
CV · Jan 15
ROMA: Real-time Omni-Multimodal Assistant with Interactive Streaming Understanding
Xueyun Tian, Wei Li, Bingbing Xu et al.
Recent Omni-multimodal Large Language Models show promise in unified audio, vision, and text modeling. However, streaming audio-video understanding remains challenging, as existing approaches suffer from disjointed capabilities: they typically exhibit incomplete modality support or lack autonomous proactive monitoring. To address this, we present ROMA, a real-time omni-multimodal assistant for unified reactive and proactive interaction. ROMA processes continuous inputs as synchronized multimodal units, aligning dense audio with discrete video frames to handle granularity mismatches. For online decision-making, we introduce a lightweight speak head that decouples response initiation from generation to ensure precise triggering without task conflict. We train ROMA with a curated streaming dataset and a two-stage curriculum that progressively optimizes for streaming format adaptation and proactive responsiveness. To standardize the fragmented evaluation landscape, we reorganize diverse benchmarks into a unified suite covering both proactive (alert, narration) and reactive (QA) settings. Extensive experiments across 12 benchmarks demonstrate ROMA achieves state-of-the-art performance on proactive tasks while competitive in reactive settings, validating its robustness in unified real-time omni-multimodal understanding.
AI · Jan 9
HAG: Hierarchical Demographic Tree-based Agent Generation for Topic-Adaptive Simulation
Rongxin Chen, Tianyu Wu, Bingbing Xu et al.
High-fidelity agent initialization is crucial for credible Agent-Based Modeling across diverse domains. A robust framework should be Topic-Adaptive, capturing macro-level joint distributions while ensuring micro-level individual rationality. Existing approaches fall into two categories: static data-based retrieval methods that fail to adapt to unseen topics absent from the data, and LLM-based generation methods that lack macro-level distribution awareness, resulting in inconsistencies between micro-level persona attributes and reality. To address these problems, we propose HAG, a Hierarchical Agent Generation framework that formalizes population generation as a two-stage decision process. First, a World Knowledge Model infers hierarchical conditional probabilities to construct the Topic-Adaptive Tree, achieving macro-level distribution alignment. Then, grounded in real-world data, instantiation and agentic augmentation are carried out to ensure micro-level consistency. Given the lack of specialized evaluation, we establish a multi-domain benchmark and a comprehensive PACE evaluation framework. Extensive experiments show that HAG significantly outperforms representative baselines, reducing population alignment errors by an average of 37.7% and enhancing sociological consistency by 18.8%.
CL · Jan 9
GIFT: Games as Informal Training for Generalizable LLMs
Nuoyan Lyu, Bingbing Xu, Weihao Meng et al.
While Large Language Models (LLMs) have achieved remarkable success in formal learning tasks such as mathematics and code generation, they still struggle with the "practical wisdom" and generalizable intelligence, such as strategic creativity and social reasoning, that characterize human cognition. This gap arises from a lack of informal learning, which thrives on interactive feedback rather than goal-oriented instruction. In this paper, we propose treating Games as a primary environment for LLM informal learning, leveraging their intrinsic reward signals and abstracted complexity to cultivate diverse competencies. To address the performance degradation observed in multi-task learning, we introduce a Nested Training Framework. Unlike naive task mixing optimizing an implicit "OR" objective, our framework employs sequential task composition to enforce an explicit "AND" objective, compelling the model to master multiple abilities simultaneously to achieve maximal rewards. Using GRPO-based reinforcement learning across Matrix Games, TicTacToe, and Who's the Spy games, we demonstrate that integrating game-based informal learning not only prevents task interference but also significantly bolsters the model's generalization across broad ability-oriented benchmarks. The framework and implementation are publicly available.
CV · Feb 28, 2025 · Code
MIGE: Mutually Enhanced Multimodal Instruction-Based Image Generation and Editing
Xueyun Tian, Wei Li, Bingbing Xu et al.
Despite significant progress in diffusion-based image generation, subject-driven generation and instruction-based editing remain challenging. Existing methods typically treat them separately, struggling with limited high-quality data and poor generalization. However, both tasks require capturing complex visual variations while maintaining consistency between inputs and outputs. Inspired by this, we propose MIGE, a unified framework that standardizes task representations using multimodal instructions. It first treats subject-driven generation as creation on a blank canvas and instruction-based editing as modification of an existing image, establishing a shared input-output formulation, then introduces a novel multimodal encoder that maps free-form multimodal instructions into a unified vision-language space, integrating visual and semantic features through a feature fusion mechanism. This unification enables joint training of both tasks, providing two key advantages: (1) Cross-Task Enhancement: by leveraging shared visual and semantic representations, joint training improves instruction adherence and visual consistency in both subject-driven generation and instruction-based editing. (2) Generalization: learning in a unified format facilitates cross-task knowledge transfer, enabling MIGE to generalize to novel compositional tasks, including instruction-based subject-driven editing. Experiments show that MIGE excels in both subject-driven generation and instruction-based editing while setting a SOTA in the new task of instruction-based subject-driven editing. Code and model have been publicly available at https://github.com/Eureka-Maggie/MIGE.
CE · Mar 17
Physics-guided diffusion models for inverse design of disordered metamaterials
Ziyuan Xie, Weipeng Xu, Dazhi Zhao et al.
Disordered metamaterials are promising for programming physical properties across diverse applications, yet their inverse design remains challenging due to the non-intuitive structure-property relationships and large design spaces. Recent generative approaches, particularly diffusion models, have shown potential in high-dimensional inverse design tasks. However, existing methods typically rely on carefully crafted training objectives, such as conditional data-driven or physics-informed loss functions. Because these strategies are inherently task-specific, the model must be retrained from scratch whenever the design problem changes (e.g., different governing equations, boundary conditions, or design objectives), severely limiting their flexibility and generalization ability. In this work, we propose physics-guided diffusion models that leverage differentiable physics-based solvers to instantly guide the generative process for inverse design. Drawing inspiration from classifier guidance, we develop a sampling strategy that directly incorporates physics guidance into the reverse stochastic differential equations. Our approach enables task-adaptive generation using gradients from differentiable solvers, while the diffusion model itself needs to be trained only once on unlabeled data. Focusing on disordered foam metamaterials, we present three representative design tasks: (1) achieving target effective thermal conductivity, (2) matching desired load-displacement response, and (3) maximizing energy absorption involving fractures. In each scenario, the proposed method successfully generates foam-like geometries that fulfill the prescribed physical objectives. These results demonstrate the versatility, efficiency, and practicality of physics-guided diffusion models for tackling complex inverse design problems in disordered metamaterials and beyond.
LG · May 7, 2025 · Code
InfoNCE is a Free Lunch for Semantically guided Graph Contrastive Learning
Zixu Wang, Bingbing Xu, Yige Yuan et al.
As an important graph pre-training method, Graph Contrastive Learning (GCL) continues to play a crucial role in the ongoing surge of research on graph foundation models or LLM as enhancer for graphs. Traditional GCL optimizes InfoNCE by using augmentations to define self-supervised tasks, treating augmented pairs as positive samples and others as negative. However, this leads to semantically similar pairs being classified as negative, causing significant sampling bias and limiting performance. In this paper, we argue that GCL is essentially a Positive-Unlabeled (PU) learning problem, where the definition of self-supervised tasks should be semantically guided, i.e., augmented samples with similar semantics are considered positive, while others, with unknown semantics, are treated as unlabeled. From this perspective, the key lies in how to extract semantic information. To achieve this, we propose IFL-GCL, using InfoNCE as a "free lunch" to extract semantic information. Specifically, we first prove that under InfoNCE, the representation similarity of node pairs aligns with the probability that the corresponding contrastive sample is positive. Then we redefine the maximum likelihood objective based on the corrected samples, leading to a new InfoNCE loss function. Extensive experiments on both the graph pretraining framework and LLM as an enhancer show significant improvements of IFL-GCL in both IID and OOD scenarios, achieving up to a 9.05% improvement, validating the effectiveness of semantic guidance. Code for IFL-GCL is publicly available at: https://github.com/Camel-Prince/IFL-GCL.
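For readers unfamiliar with the baseline objective, here is a minimal sketch of the standard InfoNCE loss for a single anchor, using cosine similarity and a temperature. It illustrates the traditional setup described above, where one augmented pair is positive and all other samples are negatives; it is not the corrected IFL-GCL loss. The function names, the temperature value, and the toy vectors are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Standard InfoNCE: -log softmax of the positive pair's logit."""
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - pos_logit


# Toy 2-D embeddings: an aligned positive gives a lower loss than a misaligned one.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned, misaligned)
```

In the second call the "negative" is identical to the anchor, which is the kind of semantically similar pair the abstract argues should not be pushed apart.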
CL · Oct 27, 2025 · Code
Multi-Personality Generation of LLMs at Decoding-time
Rongxin Chen, Yunfan Li, Yige Yuan et al.
Multi-personality generation for LLMs, enabling simultaneous embodiment of multiple personalization attributes, is a fundamental challenge. Existing retraining-based approaches are costly and poorly scalable, while decoding-time methods often rely on external models or heuristics, limiting flexibility and robustness. In this paper, we propose a novel Multi-Personality Generation (MPG) framework under the decoding-time combination paradigm. It flexibly controls multi-personality without relying on scarce multi-dimensional models or extra training, leveraging implicit density ratios in single-dimensional models as a "free lunch" to reformulate the task as sampling from a target strategy aggregating these ratios. To implement MPG efficiently, we design Speculative Chunk-level based Rejection sampling (SCR), which generates responses in chunks and validates them in parallel via estimated thresholds within a sliding window. This significantly reduces computational overhead while maintaining high-quality generation. Experiments on MBTI personality and Role-Playing demonstrate the effectiveness of MPG, showing improvements of up to 16%-18%. Code and data are available at https://github.com/Libra117/MPG.
CL · Nov 20, 2024
Fact-Level Confidence Calibration and Self-Correction
Yige Yuan, Bingbing Xu, Hexiang Tan et al.
Confidence calibration in LLMs, i.e., aligning their self-assessed confidence with the actual accuracy of their responses, enables them to self-evaluate the correctness of their outputs. However, current calibration methods for LLMs typically estimate two scalars to represent overall response confidence and correctness, which is inadequate for long-form generation where the response includes multiple atomic facts and may be only partially confident and correct. These methods also overlook the relevance of each fact to the query. To address these challenges, we propose a Fact-Level Calibration framework that operates at a finer granularity, calibrating confidence to relevance-weighted correctness at the fact level. Furthermore, comprehensive analysis under the framework inspired the development of Confidence-Guided Fact-level Self-Correction (ConFix), which uses high-confidence facts within a response as additional knowledge to improve low-confidence ones. Extensive experiments across four datasets and six models demonstrate that ConFix effectively mitigates hallucinations without requiring external knowledge sources such as retrieval systems.
CL · Jun 14, 2025
From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment
Bin Xie, Bingbing Xu, Yige Yuan et al.
Inference-time alignment methods have gained significant attention for their efficiency and effectiveness in aligning large language models (LLMs) with human preferences. However, existing dominant approaches using reward-guided search (RGS) primarily rely on outcome reward models (ORMs), which suffer from a critical granularity mismatch: ORMs are designed to provide outcome rewards for complete responses, while RGS methods rely on process rewards to guide the policy, leading to inconsistent scoring and suboptimal alignment. To address this challenge, we introduce process reward models (PRMs) into RGS and argue that an ideal PRM should satisfy two objectives: Score Consistency, ensuring coherent evaluation across partial and complete responses, and Preference Consistency, aligning partial sequence assessments with human preferences. Based on these, we propose SP-PRM, a novel dual-consistency framework integrating score consistency-based and preference consistency-based partial evaluation modules without relying on human annotation. Extensive experiments on dialogue, summarization, and reasoning tasks demonstrate that SP-PRM substantially enhances existing RGS methods, achieving a 3.6%-10.3% improvement in GPT-4 evaluation scores across all tasks.
LG · Oct 12, 2024
MITA: Bridging the Gap between Model and Data for Test-time Adaptation
Yige Yuan, Bingbing Xu, Teng Xiao et al.
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models. However, existing mainstream TTA methods, predominantly operating at batch level, often exhibit suboptimal performance in complex real-world scenarios, particularly when confronting outliers or mixed distributions. This phenomenon stems from a pronounced over-reliance on statistical patterns over the distinct characteristics of individual instances, resulting in a divergence between the distribution captured by the model and data characteristics. To address this challenge, we propose Meet-In-The-Middle based Test-Time Adaptation (MITA), which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions, thereby meeting in the middle. MITA pioneers a significant departure from traditional approaches that focus solely on aligning the model to the data, facilitating a more effective bridging of the gap between the model's distribution and data characteristics. Comprehensive experiments with MITA across three distinct scenarios (Outlier, Mixture, and Pure) demonstrate its superior performance over SOTA methods, highlighting its potential to significantly enhance generalizability in practical applications.
CL · Jan 19
Towards Robust Process Reward Modeling via Noise-aware Learning
Bin Xie, Bingbing Xu, Xueyun Tian et al.
Process Reward Models (PRMs) have achieved strong results in complex reasoning, but are bottlenecked by costly process-level supervision. A widely used alternative, Monte Carlo Estimation (MCE), defines process rewards as the probability that a policy model reaches the correct final answer from a given reasoning step. However, step correctness is an intrinsic property of the reasoning trajectory, and should be invariant to policy choice. Our empirical findings show that MCE produces policy-dependent rewards that induce label noise, including false positives that reward incorrect steps and false negatives that penalize correct ones. To address these challenges, we propose a two-stage framework to mitigate noisy supervision. In the labeling stage, we introduce a reflection-aware label correction mechanism that uses a large language model (LLM) as a judge to detect reflection and self-correction behaviors related to the current reasoning step, thereby suppressing overestimated rewards. In the training stage, we further propose a Noise-Aware Iterative Training (NAIT) framework that enables the PRM to progressively refine noisy labels based on its own confidence. Extensive experiments show that our method substantially improves step-level correctness discrimination, achieving up to a 27% absolute gain in average F1 over PRMs trained with noisy supervision.
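The Monte Carlo Estimation described above can be sketched in a few lines: from a given reasoning prefix, sample several completions from a policy and report the fraction that reach the correct final answer. The callables `rollout` and `is_correct`, the toy policies, and their success rates are all hypothetical stand-ins; a real setup would call a language model here.

```python
import random


def mc_process_reward(prefix, rollout, is_correct, n_rollouts=64, seed=0):
    """Estimate a step's reward as the success rate of rollouts from it.

    rollout(prefix, rng) -> final answer (hypothetical policy interface)
    is_correct(answer)   -> bool
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_rollouts) if is_correct(rollout(prefix, rng)))
    return hits / n_rollouts


def make_toy_policy(success_rate):
    """Toy policy that answers "42" with the given probability (assumed)."""
    def rollout(prefix, rng):
        return "42" if rng.random() < success_rate else "0"
    return rollout


def answer_is_42(answer):
    return answer == "42"


# Same reasoning step, two different policies: the estimated reward differs.
step = "x = 6 * 7"
strong = mc_process_reward(step, make_toy_policy(0.9), answer_is_42, n_rollouts=200)
weak = mc_process_reward(step, make_toy_policy(0.3), answer_is_42, n_rollouts=200)
print(strong, weak)
```

The two estimates differ even though the step itself is unchanged, which is the policy dependence the abstract identifies as a source of label noise.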
LG · May 25, 2023
PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion
Yige Yuan, Bingbing Xu, Bo Lin et al.
The generalization of neural networks is a central challenge in machine learning, especially concerning the performance under distributions that differ from training ones. Current methods, mainly based on the data-driven paradigm such as data augmentation, adversarial training, and noise injection, may encounter limited generalization due to model non-smoothness. In this paper, we propose to investigate generalization from a Partial Differential Equation (PDE) perspective, aiming to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data. Specifically, we first establish the connection between neural network generalization and the smoothness of the solution to a specific PDE, namely the transport equation. Building upon this, we propose a general framework that introduces adaptive distributional diffusion into the transport equation to enhance the smoothness of its solution, thereby improving generalization. In the context of neural networks, we put this theoretical framework into practice as PDE+ (PDE with Adaptive Distributional Diffusion), which diffuses each sample into a distribution covering semantically similar inputs. This enables better coverage of potentially unobserved distributions in training, thus improving generalization beyond merely data-driven methods. The effectiveness of PDE+ is validated through extensive experimental settings, demonstrating its superior performance compared to SOTA methods.
LG · May 25, 2023
IDEA: Invariant Defense for Graph Adversarial Robustness
Shuchang Tao, Qi Cao, Huawei Shen et al.
Despite the success of graph neural networks (GNNs), their vulnerability to adversarial attacks poses tremendous challenges for practical applications. Existing defense methods suffer from severe performance decline under unseen attacks, due to either limited observed adversarial examples or pre-defined heuristics. To address these limitations, we analyze the causalities in graph adversarial attacks and conclude that causal features are key to achieving graph adversarial robustness, because they determine labels and remain invariant across attacks. To learn these causal features, we propose an Invariant causal DEfense method against adversarial Attacks (IDEA). We derive node-based and structure-based invariance objectives from an information-theoretic perspective. IDEA ensures strong predictability for labels and invariant predictability across attacks, which is provably a causally invariant defense across various attacks. Extensive experiments demonstrate that IDEA attains state-of-the-art defense performance under all five attacks on all five datasets. The implementation of IDEA is available at https://anonymous.4open.science/r/IDEA.
LG · Dec 6, 2021
Spatio-Temporal meets Wavelet: Disentangled Traffic Flow Forecasting via Efficient Spectral Graph Attention Network
Yuchen Fang, Yanjun Qin, Haiyong Luo et al.
Traffic forecasting is crucial for public safety and resource optimization, yet is very challenging for three reasons: i) existing works mostly model intricate temporal patterns (e.g., short-term thunderstorms and long-term daily trends) within a single method, which fails to accurately capture spatio-temporal dependencies under different schemas; ii) the under-exploration of graph positional encoding limits the extraction of spatial information in the commonly used full graph attention network; iii) the quadratic complexity of full graph attention introduces heavy computational needs. To achieve effective traffic flow forecasting, we propose an efficient spectral graph attention network with disentangled traffic sequences. Specifically, the discrete wavelet transform is leveraged to obtain the low- and high-frequency components of traffic sequences, and a dual-channel encoder is designed to accurately capture the spatio-temporal dependencies under long- and short-term schemas of the low- and high-frequency components. Moreover, a novel wavelet-based graph positional encoding and a query sampling strategy are introduced in our spectral graph attention to effectively guide message passing and efficiently calculate the attention. Extensive experiments on four real-world datasets show the superiority of our model, i.e., higher traffic forecasting precision at lower computational cost.
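To make the low-/high-frequency split concrete, here is a small sketch of a single-level discrete wavelet transform. The abstract does not say which wavelet is used, so the Haar wavelet is assumed here purely for illustration; the function names and the toy traffic readings are likewise hypothetical.

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)


def haar_dwt(signal):
    """Single-level Haar DWT: returns (low-frequency, high-frequency) parts."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * _SCALE for a, b in pairs]  # smooth trend
    detail = [(a - b) * _SCALE for a, b in pairs]  # rapid fluctuations
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original sequence."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * _SCALE)
        signal.append((a - d) * _SCALE)
    return signal


# Hypothetical traffic flow readings at consecutive time steps.
flow = [40.0, 42.0, 80.0, 20.0, 55.0, 55.0]
low, high = haar_dwt(flow)
print(low)
print(high)
```

The low-frequency part tracks the slowly varying trend while the high-frequency part is large only where neighboring readings differ sharply, so the two can be fed to separate encoder channels.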
LG · Jul 27, 2020
Graph Convolutional Networks using Heat Kernel for Semi-supervised Learning
Bingbing Xu, Huawei Shen, Qi Cao et al.
Graph convolutional networks have achieved remarkable success in semi-supervised learning on graph-structured data. The key to graph-based semi-supervised learning is capturing the smoothness of labels or features over nodes induced by the graph structure. Previous methods, both spectral and spatial, define graph convolution as a weighted average over neighboring nodes, and then learn graph convolution kernels that leverage this smoothness to improve the performance of graph-based semi-supervised learning. One open challenge is how to determine an appropriate neighborhood that reflects the relevant smoothness information manifested in the graph structure. In this paper, we propose GraphHeat, leveraging the heat kernel to enhance low-frequency filters and enforce smoothness in the signal variation on the graph. GraphHeat leverages the local structure of the target node under heat diffusion to determine its neighboring nodes flexibly, without the constraint of order suffered by previous methods. GraphHeat achieves state-of-the-art results in the task of graph-based semi-supervised classification across three benchmark datasets: Cora, Citeseer and Pubmed.
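The idea of choosing neighbors by heat diffusion can be sketched as follows. The snippet computes a heat kernel exp(-sL) on a four-node path graph with a truncated Taylor series and keeps the nodes whose heat exceeds a threshold. The unnormalized Laplacian L = D - A, the scale s, the threshold, and the series truncation are all assumptions chosen for a small self-contained example and may differ from the GraphHeat formulation.

```python
def matmul(a, b):
    """Plain dense matrix product for small lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def heat_kernel(laplacian, s, terms=40):
    """Approximate exp(-s * L) with a truncated Taylor series."""
    n = len(laplacian)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} @ (-s * L) / k
        term = [[-s * x / k for x in row] for row in matmul(term, laplacian)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def heat_neighbors(kernel, node, threshold):
    """Nodes that receive more than `threshold` heat from `node`."""
    return [j for j, heat in enumerate(kernel[node]) if heat > threshold]


# Unnormalized Laplacian of the path graph 0 - 1 - 2 - 3 (assumed example).
PATH_LAPLACIAN = [
    [1.0, -1.0, 0.0, 0.0],
    [-1.0, 2.0, -1.0, 0.0],
    [0.0, -1.0, 2.0, -1.0],
    [0.0, 0.0, -1.0, 1.0],
]

kernel = heat_kernel(PATH_LAPLACIAN, s=1.0)
print(heat_neighbors(kernel, node=0, threshold=0.1))
```

Heat decays with distance from the source node, so the threshold rather than a fixed hop count decides which nodes count as neighbors.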
LG · Jul 27, 2020
Label-Consistency based Graph Neural Networks for Semi-supervised Node Classification
Bingbing Xu, Junjie Huang, Liang Hou et al.
Graph neural networks (GNNs) achieve remarkable success in graph-based semi-supervised node classification, leveraging the information from neighboring nodes to improve the representation learning of the target node. The success of GNNs at node classification depends on the assumption that connected nodes tend to have the same label. However, this assumption does not always hold, limiting the performance of GNNs at node classification. In this paper, we propose a label-consistency based graph neural network (LC-GNN), leveraging node pairs that are unconnected but share the same label to enlarge the receptive field of nodes in GNNs. Experiments on benchmark datasets demonstrate that the proposed LC-GNN outperforms traditional GNNs in graph-based semi-supervised node classification. We further show the superiority of LC-GNN in sparse scenarios with only a handful of labeled nodes.
SI · Jun 20, 2019
ANAE: Learning Node Context Representation for Attributed Network Embedding
Keting Cen, Huawei Shen, Jinhua Gao et al.
Attributed network embedding aims to learn low-dimensional node representations from both network structure and node attributes. Existing methods can be categorized into two groups: (1) the first group learns two separate node representations from network structure and node attributes respectively and concatenates them; (2) the other group obtains node representations by translating node attributes into network structure or vice versa. However, both groups have their drawbacks. The first group neglects the correlation between network structure and node attributes, while the second group assumes strong dependence between these two types of information. In this paper, we address attributed network embedding from a novel perspective, i.e., learning node context representation for each node via modeling its attributed local subgraph. To achieve this goal, we propose a novel attributed network auto-encoder framework, namely ANAE. For a target node, ANAE first aggregates the attribute information from its attributed local subgraph, obtaining its low-dimensional representation. Next, ANAE diffuses the representation of the target node to nodes in its local subgraph to reconstruct their attributes. Such an encoder-decoder framework allows the learned representations to better preserve the context information manifested in both network structure and node attributes, thus having high capacity to learn good node representations for attributed networks. Extensive experimental results on real-world datasets demonstrate that the proposed framework outperforms the state-of-the-art approaches on the tasks of link prediction and node classification.
LG · Apr 12, 2019
Graph Wavelet Neural Network
Bingbing Xu, Huawei Shen, Qi Cao et al.
We present the graph wavelet neural network (GWNN), a novel graph convolutional neural network (CNN) that leverages the graph wavelet transform to address the shortcomings of previous spectral graph CNN methods that depend on the graph Fourier transform. Unlike the graph Fourier transform, the graph wavelet transform can be obtained via a fast algorithm that does not require computationally expensive matrix eigendecomposition. Moreover, graph wavelets are sparse and localized in the vertex domain, offering high efficiency and good interpretability for graph convolution. The proposed GWNN significantly outperforms previous spectral graph CNNs in the task of graph-based semi-supervised classification on three benchmark datasets: Cora, Citeseer and Pubmed.