CLMay 27Code
Rethinking Memory as Continuously Evolving ConnectivityJizhan Fang, Buqiang Xu, Zhixian Wang et al.
Existing memory-augmented LLM agents often treat memory as a static repository with pre-defined representations and fixed retrieval pipelines, which is brittle in dynamic agentic environments where feedback, task variation, and heterogeneous signals continuously reshape what should be remembered and how it should be connected. To address this, we propose FluxMem, a connectivity-evolving memory framework that models memory as a heterogeneous graph and progressively refines its topology through three stages: initial connection formation, feedback-driven refinement, and long-term consolidation. During execution, FluxMem repairs missing links, prunes interference, aligns abstraction granularity, and distills recurrent successful trajectories into reusable procedural circuits, guided by one metric for memory generalizability and evolutionary maturity. Across three fundamentally distinct benchmarks including LoCoMo, Mind2Web, and GAIA, FluxMem achieves consistent state-of-the-art performance, demonstrating strong adaptation and generalization in complex agentic environments. The code will be open-sourced in https://github.com/zjunlp/LightMem.
LGJul 14, 2023
Benchmarks and Custom Package for Energy ForecastingZhixian Wang, Qingsong Wen, Chaoli Zhang et al.
Energy (load, wind, photovoltaic) forecasting is significant in the power industry as it can provide a reference for subsequent tasks such as power grid dispatch, thus bringing huge economic benefits. However, there are many differences between energy forecasting and traditional time series forecasting. On the one hand, traditional time series mainly focus on capturing characteristics like trends and cycles. In contrast, the energy series is largely influenced by many external factors, such as meteorological and calendar variables. On the other hand, energy forecasting aims to minimize the cost of subsequent tasks such as power grid dispatch, rather than simply pursuing prediction accuracy. In addition, the scale of energy data can also significantly impact the predicted results. In this paper, we collected large-scale load datasets and released a new renewable energy dataset that contains both station-level and region-level renewable generation data with meteorological data. For load data, we also included load domain-specific feature engineering and provided a method to customize the loss function and link the forecasting error to requirements related to subsequent tasks (such as power grid dispatching costs), integrating it into our forecasting framework. Based on such a situation, we conducted extensive experiments with 21 forecasting methods in these energy datasets at different levels under 11 evaluation metrics, providing a comprehensive reference for researchers to compare different energy forecasting models.
SYNov 7, 2023
An Explainable Framework for Machine learning-Based Reactive Power Optimization of Distribution NetworkWenlong Liao, Benjamin Schäfer, Dalin Qin et al.
To reduce the heavy computational burden of reactive power optimization of distribution networks, machine learning models are receiving increasing attention. However, most machine learning models (e.g., neural networks) are usually considered as black boxes, making it challenging for power system operators to identify and comprehend potential biases or errors in the decision-making process of machine learning models. To address this issue, an explainable machine-learning framework is proposed to optimize the reactive power in distribution networks. Firstly, a Shapley additive explanation framework is presented to measure the contribution of each input feature to the solution of reactive power optimizations generated from machine learning models. Secondly, a model-agnostic approximation method is developed to estimate Shapley values, so as to avoid the heavy computational burden associated with direct calculations of Shapley values. The simulation results show that the proposed explainable framework can accurately explain the solution of the machine learning model-based reactive power optimization by using visual analytics, from both global and instance perspectives. Moreover, the proposed explainable framework is model-agnostic, and thus applicable to various models (e.g., neural networks).
AISep 26, 2025Code
Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and SafetyJunliang Liu, Jingyu Xiao, Wenxin Tang et al.
Multimodal large language models (MLLMs) are increasingly positioned as AI collaborators for building complex web-related applications like GUI agents and front-end code generation. However, existing benchmarks largely emphasize visual perception or UI code generation, showing insufficient evaluation on the reasoning, robustness and safety capability required for end-to-end web applications. To bridge the gap, we introduce a comprehensive web understanding benchmark, named WebRSSBench, that jointly evaluates Reasoning, Robustness, and Safety across eight tasks, such as position relationship reasoning, color robustness, and safety critical detection, etc. The benchmark is constructed from 729 websites and contains 3799 question answer pairs that probe multi-step inference over page structure, text, widgets, and safety-critical interactions. To ensure reliable measurement, we adopt standardized prompts, deterministic evaluation scripts, and multi-stage quality control combining automatic checks with targeted human verification. We evaluate 12 MLLMs on WebRSSBench. The results reveal significant gaps, models still struggle with compositional and cross-element reasoning over realistic layouts, show limited robustness when facing perturbations in user interfaces and content such as layout rearrangements or visual style shifts, and are rather conservative in recognizing and avoiding safety critical or irreversible actions. Our code is available at https://github.com/jinliang-byte/webssrbench.
LGApr 11
Closed-Form Concept Erasure via Double ProjectionsChi Zhang, Jingpu Cheng, Zhixian Wang et al.
While modern generative models such as diffusion-based architectures have enabled impressive creative capabilities, they also raise important safety and ethical risks. These concerns have led to growing interest in concept erasure, the process of removing unwanted concepts from model representations. Existing approaches often achieve strong erasure performance but rely on iterative optimization and may inadvertently distort unrelated concepts. In this work, we present a simple yet principled alternative: a linear transformation framework that achieves concept erasure analytically, without any training. Our method adapts a pretrained model through two sequential, closed-form steps: first, computing a proxy projection of the target concept, and second, applying a constrained transformation within the left null space of known concept directions. This design yields a deterministic and geometrically interpretable procedure for safe, efficient, and theory-grounded concept removal. Across a wide range of experiments, including object and style erasure on multiple Stable Diffusion variants and the flow-matching model (FLUX), our approach matches or surpasses the performance of state-of-the-art methods while preserving non-target concepts more faithfully. Requiring only a few seconds to apply, it offers a lightweight and drop-in tool for controlled model editing, advancing the goal of safer and more responsible generative models.
LGMay 31, 2023
DiffLoad: Uncertainty Quantification in Electrical Load Forecasting with the Diffusion ModelZhixian Wang, Qingsong Wen, Chaoli Zhang et al.
Electrical load forecasting plays a crucial role in decision-making for power systems, including unit commitment and economic dispatch. The integration of renewable energy sources and the occurrence of external events, such as the COVID-19 pandemic, have rapidly increased uncertainties in load forecasting. The uncertainties in load forecasting can be divided into two types: epistemic uncertainty and aleatoric uncertainty. Separating these types of uncertainties can help decision-makers better understand where and to what extent the uncertainty is, thereby enhancing their confidence in the following decision-making. This paper proposes a diffusion-based Seq2Seq structure to estimate epistemic uncertainty and employs the robust additive Cauchy distribution to estimate aleatoric uncertainty. Our method not only ensures the accuracy of load forecasting but also demonstrates the ability to separate the two types of uncertainties and be applicable to different levels of loads. The relevant code can be found at \url{https://anonymous.4open.science/r/DiffLoad-4714/}.