LGNov 4, 2023
From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion ModelsZhuoshi Pan, Yuguang Yao, Gaowen Liu et al.
While state-of-the-art diffusion models (DMs) excel in image generation, concerns regarding their security persist. Earlier research highlighted DMs' vulnerability to data poisoning attacks, but these studies placed stricter requirements than conventional methods like `BadNets' in image classification. This is because the art necessitates modifications to the diffusion training and sampling procedures. Unlike the prior work, we investigate whether BadNets-like data poisoning methods can directly degrade the generation by DMs. In other words, if only the training dataset is contaminated (without manipulating the diffusion process), how will this affect the performance of learned DMs? In this setting, we uncover bilateral data poisoning effects that not only serve an adversarial purpose (compromising the functionality of DMs) but also offer a defensive advantage (which can be leveraged for defense in classification tasks against poisoning attacks). We show that a BadNets-like data poisoning attack remains effective in DMs for producing incorrect images (misaligned with the intended text conditions). Meanwhile, poisoned DMs exhibit an increased ratio of triggers, a phenomenon we refer to as `trigger amplification', among the generated images. This insight can be then used to enhance the detection of poisoned training data. In addition, even under a low poisoning ratio, studying the poisoning effects of DMs is also valuable for designing robust image classifiers against such attacks. Last but not least, we establish a meaningful linkage between data poisoning and the phenomenon of data replications by exploring DMs' inherent data memorization tendencies.
LGNov 10, 2024Code
UniGAD: Unifying Multi-level Graph Anomaly DetectionYiqing Lin, Jianheng Tang, Chenyi Zi et al.
Graph Anomaly Detection (GAD) aims to identify uncommon, deviated, or suspicious objects within graph-structured data. Existing methods generally focus on a single graph object type (node, edge, graph, etc.) and often overlook the inherent connections among different object types of graph anomalies. For instance, a money laundering transaction might involve an abnormal account and the broader community it interacts with. To address this, we present UniGAD, the first unified framework for detecting anomalies at node, edge, and graph levels jointly. Specifically, we develop the Maximum Rayleigh Quotient Subgraph Sampler (MRQSampler) that unifies multi-level formats by transferring objects at each level into graph-level tasks on subgraphs. We theoretically prove that MRQSampler maximizes the accumulated spectral energy of subgraphs (i.e., the Rayleigh quotient) to preserve the most significant anomaly information. To further unify multi-level training, we introduce a novel GraphStitch Network to integrate information across different levels, adjust the amount of sharing required at each level, and harmonize conflicting training goals. Comprehensive experiments show that UniGAD outperforms both existing GAD methods specialized for a single task and graph prompt-based approaches for multiple tasks, while also providing robust zero-shot task transferability. All codes can be found at https://github.com/lllyyq1121/UniGAD.
CLJul 9, 2025Code
InvestAlign: Overcoming Data Scarcity in Aligning Large Language Models with Investor Decision-Making Processes under Herd BehaviorHuisheng Wang, Zhuoshi Pan, Hangjing Zhang et al. · tsinghua
Aligning Large Language Models (LLMs) with investor decision-making processes under herd behavior is a critical challenge in behavioral finance, which grapples with a fundamental limitation: the scarcity of real-user data needed for Supervised Fine-Tuning (SFT). While SFT can bridge the gap between LLM outputs and human behavioral patterns, its reliance on massive authentic data imposes substantial collection costs and privacy risks. We propose InvestAlign, a novel framework that constructs high-quality SFT datasets by leveraging theoretical solutions to similar and simple optimal investment problems rather than complex scenarios. Our theoretical analysis demonstrates that training LLMs with InvestAlign-generated data achieves faster parameter convergence than using real-user data, suggesting superior learning efficiency. Furthermore, we develop InvestAgent, an LLM agent fine-tuned with InvestAlign, which demonstrates significantly closer alignment to real-user data than pre-SFT models in both simple and complex investment problems. This highlights our proposed InvestAlign as a promising approach with the potential to address complex optimal investment problems and align LLMs with investor decision-making processes under herd behavior. Our code is publicly available at https://github.com/thu-social-network-research-group/InvestAlign.
AIAug 28, 2023
Spread Control Method on Unknown Networks Based on Hierarchical Reinforcement LearningWenxiang Dong, Zhanjiang Chen, H. Vicky Zhao
Epidemics such as COVID-19 pose serious threats to public health and our society, and it is critical to investigate effective methods to control the spread of epidemics over networks. Prior works on epidemic control often assume complete knowledge of network structures, a presumption seldom valid in real-world situations. In this paper, we study epidemic control on networks with unknown structures, and propose a hierarchical reinforcement learning framework for joint network structure exploration and epidemic control. To reduce the action space and achieve computation tractability, our proposed framework contains three modules: the Policy Selection Module, which determines whether to explore the structure or remove nodes to control the epidemic; the Explore Module, responsible for selecting nodes to explore; and the Remove Module, which decides which nodes to remove to stop the epidemic spread. Simulation results show that our proposed method outperforms baseline methods.
59.1MFApr 13
Mechanism Design for Investment Regulation under HerdingHuisheng Wang, H. Vicky Zhao
Herding, where investors imitate others' decisions rather than relying on their own analysis, is a prevalent phenomenon in financial markets. Excessive herding distorts rational decisions, amplifies volatility, and can be exploited by manipulators to harm the market. Traditional regulatory tools, such as information disclosure and transaction restrictions, are often imprecise and lack theoretical guarantees for effectiveness. This calls for a quantitative approach to regulating herding. We propose a regulator-leader-follower trilateral game framework based on optimal control theory to study the complex dynamics among them. The leader makes rational decisions, the follower maximizes utility while aligning with the leader's decisions, whereas the regulator designs a mechanism to maximize social welfare and minimize regulatory cost. We derive the follower's decisions and the regulator's mechanisms, theoretically analyze the impact of regulation on decisions, and investigate effective mechanisms to improve social welfare.
LGMay 19, 2025Code
Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic ForecastJi Qi, Tam Thuc Do, Mingxiao Liu et al.
Unlike conventional "black-box" transformers with classical self-attention mechanism, we build a lightweight and interpretable transformer-like neural net by unrolling a mixed-graph-based optimization algorithm to forecast traffic with spatial and temporal dimensions. We construct two graphs: an undirected graph $\mathcal{G}^u$ capturing spatial correlations across geography, and a directed graph $\mathcal{G}^d$ capturing sequential relationships over time. We predict future samples of signal $\mathbf{x}$, assuming it is "smooth" with respect to both $\mathcal{G}^u$ and $\mathcal{G}^d$, where we design new $\ell_2$ and $\ell_1$-norm variational terms to quantify and promote signal smoothness (low-frequency reconstruction) on a directed graph. We design an iterative algorithm based on alternating direction method of multipliers (ADMM), and unroll it into a feed-forward network for data-driven parameter learning. We insert graph learning modules for $\mathcal{G}^u$ and $\mathcal{G}^d$ that play the role of self-attention. Experiments show that our unrolled networks achieve competitive traffic forecast performance as state-of-the-art prediction schemes, while reducing parameter counts drastically. Our code is available in https://github.com/SingularityUndefined/Unrolling-GSP-STForecast .
CLMar 19, 2024Code
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt CompressionZhuoshi Pan, Qianhui Wu, Huiqiang Jiang et al.
This paper focuses on task-agnostic prompt compression for better generalizability and efficiency. Considering the redundancy in natural language, existing approaches compress prompts by removing tokens or lexical units according to their information entropy obtained from a causal language model such as LLaMa-7B. The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset. We formulate prompt compression as a token classification problem to guarantee the faithfulness of the compressed prompt to the original one, and use a Transformer encoder as the base architecture to capture all essential information for prompt compression from the full bidirectional context. Our approach leads to lower latency by explicitly learning the compression objective with smaller models such as XLM-RoBERTa-large and mBERT. We evaluate our method on both in-domain and out-of-domain datasets, including MeetingBank, LongBench, ZeroScrolls, GSM8K, and BBH. Despite its small size, our model shows significant performance gains over strong baselines and demonstrates robust generalization ability across different LLMs. Additionally, our model is 3x-6x faster than existing prompt compression methods, while accelerating the end-to-end latency by 1.6x-2.9x with compression ratios of 2x-5x. Our code is available at https://aka.ms/LLMLingua-2.
IRMar 10, 2023
Pacos: Modeling Users' Interpretable and Context-Dependent Choices in Preference ReversalsQingming Li, H. Vicky Zhao
Choice problems refer to selecting the best choices from several items, and learning users' preferences in choice problems is of great significance in understanding the decision making mechanisms and providing personalized services. Existing works typically assume that people evaluate items independently. In practice, however, users' preferences depend on the market in which items are placed, which is known as context effects; and the order of users' preferences for two items may even be reversed, which is referred to preference reversals. In this work, we identify three factors contributing to context effects: users' adaptive weights, the inter-item comparison, and display positions. We propose a context-dependent preference model named Pacos as a unified framework for addressing three factors simultaneously, and consider two design methods including an additive method with high interpretability and an ANN-based method with high accuracy. We study the conditions for preference reversals to occur and provide an theoretical proof of the effectiveness of Pacos in addressing preference reversals. Experimental results show that the proposed method has better performance than prior works in predicting users' choices, and has great interpretability to help understand the cause of preference reversals.
IRMar 9, 2023
Probe: Learning Users' Personalized Projection Bias in Intertemporal ChoicesQingming Li, H. Vicky Zhao
Intertemporal choices involve making decisions that require weighing the costs in the present against the benefits in the future. One specific type of intertemporal choice is the decision between purchasing an individual item or opting for a bundle that includes that item. Previous research assumes that individuals have accurate expectations of the factors involved in these choices. However, in reality, users' perceptions of these factors are often biased, leading to irrational and suboptimal decision-making. In this work, we specifically focus on two commonly observed biases: projection bias and the reference-point effect. To address these biases, we propose a novel bias-embedded preference model called Probe. The Probe incorporates a weight function to capture users' projection bias and a value function to account for the reference-point effect, and introduce prospect theory from behavioral economics to combine the weight and value functions. This allows us to determine the probability of users selecting the bundle or a single item. We provide a thorough theoretical analysis to demonstrate the impact of projection bias on the design of bundle sales strategies. Through experimental results, we show that the proposed Probe model outperforms existing methods and contributes to a better understanding of users' irrational behaviors in bundle purchases. This investigation can facilitate a deeper comprehension of users' decision-making mechanisms, enable the provision of personalized services, and assist users in making more rational and optimal decisions.
CLFeb 8, 2025
On Memory Construction and Retrieval for Personalized Conversational AgentsZhuoshi Pan, Qianhui Wu, Huiqiang Jiang et al. · microsoft-research
To deliver coherent and personalized experiences in long-term conversations, existing approaches typically perform retrieval augmented response generation by constructing memory banks from conversation history at either the turn-level, session-level, or through summarization techniques.In this paper, we present two key findings: (1) The granularity of memory unit matters: turn-level, session-level, and summarization-based methods each exhibit limitations in both memory retrieval accuracy and the semantic quality of the retrieved content. (2) Prompt compression methods, such as LLMLingua-2, can effectively serve as a denoising mechanism, enhancing memory retrieval accuracy across different granularities. Building on these insights, we propose SeCom, a method that constructs the memory bank at segment level by introducing a conversation segmentation model that partitions long-term conversations into topically coherent segments, while applying compression based denoising on memory units to enhance memory retrieval. Experimental results show that SeCom exhibits a significant performance advantage over baselines on long-term conversation benchmarks LOCOMO and Long-MT-Bench+. Additionally, the proposed conversation segmentation method demonstrates superior performance on dialogue segmentation datasets such as DialSeg711, TIAGE, and SuperDialSeg.
LGMar 21, 2025
LEMMA: Learning from Errors for MatheMatical Advancement in LLMsZhuoshi Pan, Yu Li, Honglin Lin et al.
Large language models (LLMs) have demonstrated remarkable reasoning capability in solving mathematical problems. However, existing approaches primarily focus on improving the quality of correct training data, e.g., distilling high-quality correct solutions from advanced models, neglecting the value contained in error data, potentially hindering the model's reflective ability. Though some studies attempt to leverage error data, they often involve complex mechanisms, such as Monte Carlo Tree Search (MCTS) to explore error nodes. In this work, we propose to enhance LLMs' reasoning ability by Learning from Errors for Mathematical Advancement (LEMMA). LEMMA constructs data consisting of an incorrect solution with an erroneous step and a reflection connection to a correct solution for fine-tuning. Specifically, we systematically analyze the model-generated error types and introduce an error-type grounded mistake augmentation method to collect diverse and representative errors. Correct solutions are either from fixing the errors or generating a fresh start. Through a model-aware smooth reflection connection, the erroneous solution is transferred to the correct one. By fine-tuning on the constructed dataset, the model is able to self-correct errors autonomously within the generation process without relying on external critique models. Experimental results demonstrate that LEMMA achieves significant performance improvements over other strong baselines.
CLJul 14, 2025
REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at OnceZhuoshi Pan, Qizhi Pei, Yu Li et al.
Recent Large Reasoning Models (LRMs) have achieved remarkable progress on task-specific benchmarks, yet their evaluation methods remain constrained by isolated problem-solving paradigms. Existing benchmarks predominantly assess single-question reasoning through sequential testing, resulting critical limitations: (1) vulnerability to data contamination and less challenging (e.g., DeepSeek-R1 achieves 97.0% on MATH500), forcing costly creation of new questions with large human efforts, (2) failure to evaluate models under multi-context pressure, a key requirement for real-world deployment. To bridge this gap, we present REST (Reasoning Evaluation through Simultaneous Testing), a stress-testing framework that exposes LRMs to multiple problems simultaneously. Beyond basic reasoning, REST evaluates several under-tested capabilities: contextual priority allocation, cross-problem interference resistance, and dynamic cognitive load management. Our evaluation reveals several striking findings: Even state-of-the-art (SOTA) models like DeepSeek-R1 exhibit substantial performance degradation under stress testing. Crucially, REST demonstrates stronger discriminative power than existing benchmarks, revealing pronounced performance differences among models that exhibit similar, near-ceiling performance under single-question evaluations. Some key insights emerge from our analysis: (1) the "overthinking trap" is a critical factor contributing to the performance degradation; (2) the models trained with "long2short" technique preserve more accuracy of their single-problem performance under REST, outperforming standard-trained counterparts. These results establish REST as a cost-efficient, future-proof evaluation paradigm that better reflects real-world reasoning demands while reducing reliance on continuous human annotation. Code and results are available at https://opendatalab.github.io/REST.
LGMay 22, 2024
Emulating Full Participation: An Effective and Fair Client Selection Strategy for Federated LearningQingming Li, Juzheng Miao, Puning Zhao et al.
In federated learning, client selection is a critical problem that significantly impacts both model performance and fairness. Prior studies typically treat these two objectives separately, or balance them using simple weighting schemes. However, we observe that commonly used metrics for model performance and fairness often conflict with each other, and a straightforward weighted combination is insufficient to capture their complex interactions. To address this, we first propose two guiding principles that directly tackle the inherent conflict between the two metrics while reinforcing each other. Based on these principles, we formulate the client selection problem as a long-term optimization task, leveraging the Lyapunov function and the submodular nature of the problem to solve it effectively. Experiments show that the proposed method improves both model performance and fairness, guiding the system to converge comparably to full client participation. This improvement can be attributed to the fact that both model performance and fairness benefit from the diversity of the selected clients' data distributions. Our approach adaptively enhances this diversity by selecting clients based on their data distributions, thereby improving both model performance and fairness.
MMApr 14, 2021
Landmarking for Navigational Streaming of Stored High-Dimensional MediaYuan Yuan, Gene Cheung, Pascal Frossard et al.
Modern media data such as 360 videos and light field (LF) images are typically captured in much higher dimensions than the observers' visual displays. To efficiently browse high-dimensional media over bandwidth-constrained networks, a navigational streaming model is considered: a client navigates the large media space by dictating a navigation path to a server, who in response transmits the corresponding pre-encoded media data units (MDU) to the client one-by-one in sequence. Intra-coding an MDU (I-MDU) would result in a large bitrate but I-MDU can be randomly accessed, while inter-coding an MDU (P-MDU) using another MDU as a predictor incurs a small coding cost but imposes an order where the predictor must be first transmitted and decoded. From a compression perspective, the technical challenge is: how to achieve coding gain via inter-coding of MDUs, while enabling adequate random access for satisfactory user navigation. To address this problem, we propose landmarks, a selection of key MDUs from the high-dimensional media. Using landmarks as predictors, nearby MDUs in local neighborhoods are intercoded, resulting in a predictive MDU structure with controlled coding cost. It means that any requested MDU can be decoded by at most transmitting a landmark and an inter-coded MDU, enabling navigational random access. To build a landmarked MDU structure, we employ tree-structured vector quantizer (TSVQ) to first optimize landmark locations, then iteratively add/remove inter-coded MDUs as refinements using a fast branch-and-bound technique. Taking interactive LF images and viewport adaptive 360 images as illustrative applications, and I-, P- and previously proposed merge frames to intra- and inter-code MDUs, we show experimentally that landmarked MDU structures can noticeably reduce the expected transmission cost compared with MDU structures without landmarks.