98.9LGJun 2Code
Exploiting Verification-Generation Gap: Test-Time Reinforcement Learning with Confidence-Conditioned VerificationJiahui Li, Jianfeng Shan, Wenpei Chen et al.
Test-time reinforcement learning has emerged as a promising paradigm for enhancing the complex reasoning abilities of large language models in a completely label-free manner. Despite existing studies focusing on Pass@1 performance, optimizing Pass@k remains under-explored yet critical in label-free settings, which measures generation coverage for sustained exploration. Optimizing Pass@k in label-free setting is highly non-trivial, as directly applying the Pass@k advantage designs effective for RLVR yields unsatisfactory performance. Through in-depth empirical analysis, we discover the root causes hindering performance: pseudo-label estimations for low-confidence samples have a high probability of being incorrect, while candidate answers for high-confidence samples suffer from severe diversity collapse. To overcome these hurdles, we propose TTRL-CoCoV (Test-Time Reinforcement Learning with Confidence-Conditioned Verification), a novel confidence-adaptive framework that expands Pass@k coverage and improves Pass@1 performance. Based on our key insight that verification capability generally leads generation capability, TTRL-CoCoV employs a confidence-conditioned mechanism: for high-confidence samples, it bootstraps verifier and applies an exploration-enhancing reward to prevent diversity collapse; for low-confidence samples, it delegates pseudo-label selection to the verifier to filter incorrect pseudo-labels; and for medium-confidence samples, it bypasses verification entirely. Extensive experiments demonstrate that TTRL-CoCoV outperforms the best competing methods across 6 widely-recognized benchmarks, achieves average absolute gains of +9.8% in Pass@1 and +18.7% in Pass@16 over TTRL, and even achieves absolute Pass@1 improvements of up to +5.0% across multiple reasoning benchmarks when compared against fully supervised RL methods. Our code repository: https://github.com/shanjf666/CoCoV.
67.9AIJun 2
TSQAgent: Rating Time Series Data Quality via Dedicated Agentic ReasoningShunyu Wu, Dan Li, Haozheng Ye et al.
Assessing the quality of time series (TS) data is fundamental yet inherently challenging due to the multifaceted nature of quality dimensions. Recently, large language models (LLMs) have emerged as a promising paradigm for TS quality assessment via pairwise comparison and per-dimension evaluation. However, existing approaches rely on manually predefined quality dimensions and purely text-based reasoning, leaving it unknown whether LLMs can identify truly relevant quality dimensions or perform grounded and quantitative quality comparisons. To investigate this, we construct TSQBench, a dedicated benchmark for evaluating LLMs on two progressive capabilities: (i) understanding and identifying relevant quality dimensions, and (ii) performing quality comparison under specific dimensions. Our analysis reveals that current LLMs consistently struggle with both dimension identification and evidence-grounded quality comparison. To address these limitations, we propose TSQAgent, a novel agentic reasoning framework for TS quality rating consisting of three collaborative roles: Perceiver for focused dimension selection, Inspector for dimension-wise quantitative analysis, and Adjudicator that aggregates and refines the final judgment. In particular, we introduce an agentic reasoning strategy that instills the ability to identify and prioritize the most relevant quality dimensions, and further propose an agent workflow equipped with external analytical tools to enable precise quantitative comparisons over selected dimensions. Experiments on both the proposed benchmark and eleven real-world datasets demonstrate that our framework not only substantially improves LLMs' capabilities in quality understanding and quantitative comparison but also effectively translates these improvements into better quality-aware data selection, leading to enhanced downstream performance and data efficiency.
LGMay 7, 2022
Decentral and Incentivized Federated Learning Frameworks: A Systematic Literature ReviewLeon Witt, Mathis Heyer, Kentaroh Toyoda et al.
The advent of Federated Learning (FL) has ignited a new paradigm for parallel and confidential decentralized Machine Learning (ML) with the potential of utilizing the computational power of a vast number of IoT, mobile and edge devices without data leaving the respective device, ensuring privacy by design. Yet, in order to scale this new paradigm beyond small groups of already entrusted entities towards mass adoption, the Federated Learning Framework (FLF) has to become (i) truly decentralized and (ii) participants have to be incentivized. This is the first systematic literature review analyzing holistic FLFs in the domain of both, decentralized and incentivized federated learning. 422 publications were retrieved, by querying 12 major scientific databases. Finally, 40 articles remained after a systematic review and filtering process for in-depth examination. Although having massive potential to direct the future of a more distributed and secure AI, none of the analyzed FLF is production-ready. The approaches vary heavily in terms of use-cases, system design, solved issues and thoroughness. We are the first to provide a systematic approach to classify and quantify differences between FLF, exposing limitations of current works and derive future directions for research in this novel domain.
NAJun 20, 2018
Superconvergence of the Gradient Approximation for Weak Galerkin Finite Element Methods on Nonuniform Rectangular PartitionsDan Li, Chunmei Wang, Junping Wang
This article presents a superconvergence for the gradient approximation of the second order elliptic equation discretized by the weak Galerkin finite element methods on nonuniform rectangular partitions. The result shows a convergence of ${\cal O}(h^r)$, $1.5\leq r \leq 2$, for the numerical gradient obtained from the lowest order weak Galerkin element consisting of piecewise linear and constant functions. For this numerical scheme, the optimal order of error estimate is ${\cal O}(h)$ for the gradient approximation. The superconvergence reveals a superior performance of the weak Galerkin finite element methods. Some computational results are included to numerically validate the superconvergence theory.
CLSep 11, 2023
An Empirical Study of NetOps Capability of Pre-Trained Large Language ModelsYukai Miao, Yu Bai, Li Chen et al.
Nowadays, the versatile capabilities of Pre-trained Large Language Models (LLMs) have attracted much attention from the industry. However, some vertical domains are more interested in the in-domain capabilities of LLMs. For the Networks domain, we present NetEval, an evaluation set for measuring the comprehensive capabilities of LLMs in Network Operations (NetOps). NetEval is designed for evaluating the commonsense knowledge and inference ability in NetOps in a multi-lingual context. NetEval consists of 5,732 questions about NetOps, covering five different sub-domains of NetOps. With NetEval, we systematically evaluate the NetOps capability of 26 publicly available LLMs. The results show that only GPT-4 can achieve a performance competitive to humans. However, some open models like LLaMA 2 demonstrate significant potential.
LGApr 23, 2022
A Novel Splitting Criterion Inspired by Geometric Mean Metric Learning for Decision TreeDan Li, Songcan Chen
Decision tree (DT) attracts persistent research attention due to its impressive empirical performance and interpretability in numerous applications. However, the growth of traditional yet widely-used univariate decision trees (UDTs) is quite time-consuming as they need to traverse all the features to find the splitting value with the maximal reduction of the impurity at each internal node. In this paper, we newly design a splitting criterion to speed up the growth. The criterion is induced from Geometric Mean Metric Learning (GMML) and then optimized under its diagonalized metric matrix constraint, consequently, a closed-form rank of feature discriminant abilities can at once be obtained and the top 1 feature at each node used to grow an intent DT (called as dGMML-DT, where d is an abbreviation for diagonalization). We evaluated the performance of the proposed methods and their corresponding ensembles on benchmark datasets. The experiment shows that dGMML-DT achieves comparable or better classification results more efficiently than the UDTs with 10x average speedup. Furthermore, dGMML-DT can straightforwardly be extended to its multivariable counterpart (dGMML-MDT) without needing laborious operations.
20.8CVMay 24
DA-UCT: Self-Supervised Domain-Adaptive Ultrasound Computed Tomography for Rapid Musculoskeletal Sound Speed ReconstructionTianyu Liu, Heyu Ma, Aiduo Wang et al.
Ultrasound computed tomography (UCT) via full waveform inversion (FWI) enables high-resolution quantitative imaging for tissue characterization and disease diagnosis. However, UCT suffers from large computational burden and severe convergence issues due to highly nonlinear optimization. Deep learning can accelerate UCT reconstruction, but supervised training requires large-scale labeled datasets difficult to obtain in vivo. To address these limitations, we propose SDA-UCT, a two-stage self-supervised domain-adaptive framework for rapid and accurate UCT imaging of musculoskeletal tissues. SDA-UCT employs an attention-enhanced network (AttUCT) pre-trained on simulation datasets and transfers to in-vivo data via physics-informed self-supervised learning, effectively bridging the simulation-to-real domain gap. A Low-Rank Adaptation (LoRA) mechanism is integrated to enable efficient adaptation across diverse clinical scenarios. Results showed that AttUCT achieved high-quality SOS reconstruction for simulated human forearm with a PSNR of 29.23 dB and SSIM of 0.928, outperforming conventional FWI and existing deep learning methods. Validated on in-vivo data, SDA-UCT successfully reconstructed SOS images revealing complex anatomical structures (skin, fat, muscle, tendon, bone and bone marrow) for human forearm, in high concordance with MRI references. The LoRA mechanism adjusting only 3% of parameters achieved comparable performance to full fine-tuning. The rapid reconstruction (5 ms per frame) enables real-time 3D visualization, achieving five-orders-of-magnitude improvement over traditional FWI. This work represents the first self-supervised domain-adaptive deep learning for rapid, high-resolution in-vivo UCT imaging, showing potential for musculoskeletal disease diagnosis.
36.7AIMay 7
Time Series Reasoning via Process-Verifiable Thinking Data Synthesis and Scheduling for Tailored LLM ReasoningJiahui Zhou, Dan Li, Boxin Li et al.
Time series is a pervasive data type across various application domains, rendering the reasonable solving of diverse time series tasks a long-standing goal. Recent advances in large language models (LLMs), especially their reasoning abilities unlocked through reinforcement learning (RL), have opened new opportunities for tackling tasks with long Chain-of-Thought (CoT) reasoning. However, leveraging LLM reasoning for time series remains in its infancy, hindered by the absence of carefully curated time series CoT data for training, limited data efficiency caused by underexplored data scheduling, and the lack of RL algorithms tailored for exploiting such time series CoT data. In this paper, we introduce VeriTime, a framework that tailors LLMs for time series reasoning through data synthesis, data scheduling, and RL training. First, we propose a data synthesis pipeline that constructs a TS-text multimodal dataset with process-verifiable annotations. Second, we design a data scheduling mechanism that arranges training samples according to a principled hierarchy of difficulty and task taxonomy. Third, we develop a two-stage reinforcement finetuning featuring fine-grained, multi-objective rewards that leverage verifiable process-level CoT data. Extensive experiments show that VeriTime substantially boosts LLM performance across diverse time series reasoning tasks. Notably, it enables compact 3B, 4B models to achieve reasoning capabilities on par with or exceeding those of larger proprietary LLMs.
98.4CVMay 4Code
Mamoda2.5: Enhancing Unified Multimodal Model with DiT-MoEYangming Shi, Shixiang Zhu, Tao Shen et al.
We present Mamoda2.5, a unified AR-Diffusion framework that seamlessly integrates multimodal understanding and generation within a single architecture. To efficiently enhance the model's generation capability, we equip the Diffusion Transformer backbone with a fine-grained Mixture-of-Experts (MoE) design (128 experts, Top-8 routing), yielding a 25B-parameter model that activates only 3B parameters, significantly reducing training costs while scaling up the model capacity. Mamoda2.5 achieves top-tier generation performance on VBench 2.0 and sets a new record in video editing quality, surpassing evaluated open-source models and matching the performance of current top-tier proprietary models, including the Kling O1 on OpenVE-Bench. Furthermore, we introduce a joint few-step distillation and reinforcement learning framework that compresses the 30-step editing model into a 4-step model and greatly accelerates model inference. Compared to open-source baselines, Mamoda2.5 achieves up to $95.9\times$ faster video editing inference. In real-world applications, Mamoda2.5 has been successfully deployed for content moderation and creative restoration tasks in advertising scenarios, achieving a 98% success rate in internal advertising video editing scenario.
ROJun 24, 2022
Eco-driving for Electric Connected Vehicles at Signalized Intersections: A Parameterized Reinforcement Learning approachXia Jiang, Jian Zhang, Dan Li
This paper proposes an eco-driving framework for electric connected vehicles (CVs) based on reinforcement learning (RL) to improve vehicle energy efficiency at signalized intersections. The vehicle agent is specified by integrating the model-based car-following policy, lane-changing policy, and the RL policy, to ensure safe operation of a CV. Subsequently, a Markov Decision Process (MDP) is formulated, which enables the vehicle to perform longitudinal control and lateral decisions, jointly optimizing the car-following and lane-changing behaviors of the CVs in the vicinity of intersections. Then, the hybrid action space is parameterized as a hierarchical structure and thereby trains the agents with two-dimensional motion patterns in a dynamic traffic environment. Finally, our proposed methods are evaluated in SUMO software from both a single-vehicle-based perspective and a flow-based perspective. The results show that our strategy can significantly reduce energy consumption by learning proper action schemes without any interruption of other human-driven vehicles (HDVs).
LGOct 10, 2023
A Variational Autoencoder Framework for Robust, Physics-Informed Cyberattack Recognition in Industrial Cyber-Physical SystemsNavid Aftabi, Dan Li, Paritosh Ramanan
Cybersecurity of Industrial Cyber-Physical Systems is drawing significant concerns as data communication increasingly leverages wireless networks. A lot of data-driven methods were develope for detecting cyberattacks, but few are focused on distinguishing them from equipment faults. In this paper, we develop a data-driven framework that can be used to detect, diagnose, and localize a type of cyberattack called covert attacks on networked industrial control systems. The framework has a hybrid design that combines a variational autoencoder (VAE), a recurrent neural network (RNN), and a Deep Neural Network (DNN). This data-driven framework considers the temporal behavior of a generic physical system that extracts features from the time series of the sensor measurements that can be used for detecting covert attacks, distinguishing them from equipment faults, as well as localize the attack/fault. We evaluate the performance of the proposed method through a realistic simulation study on a networked power transmission system as a typical example of ICS. We compare the performance of the proposed method with the traditional model-based method to show its applicability and efficacy.
AIDec 2, 2024Code
FD-LLM: Large Language Model for Fault Diagnosis of MachinesHamzah A. A. M. Qaid, Bo Zhang, Dan Li et al.
Large language models (LLMs) are effective at capturing complex, valuable conceptual representations from textual data for a wide range of real-world applications. However, in fields like Intelligent Fault Diagnosis (IFD), incorporating additional sensor data-such as vibration signals, temperature readings, and operational metrics-is essential but it is challenging to capture such sensor data information within traditional text corpora. This study introduces a novel IFD approach by effectively adapting LLMs to numerical data inputs for identifying various machine faults from time-series sensor data. We propose FD-LLM, an LLM framework specifically designed for fault diagnosis by formulating the training of the LLM as a multi-class classification problem. We explore two methods for encoding vibration signals: the first method uses a string-based tokenization technique to encode vibration signals into text representations, while the second extracts statistical features from both the time and frequency domains as statistical summaries of each signal. We assess the fault diagnosis capabilities of four open-sourced LLMs based on the FD-LLM framework, and evaluate the models' adaptability and generalizability under various operational conditions and machine components, namely for traditional fault diagnosis, cross-operational conditions, and cross-machine component settings. Our results show that LLMs such as Llama3 and Llama3-instruct demonstrate strong fault detection capabilities and significant adaptability across different operational conditions, outperforming state-of-the-art deep learning (DL) approaches in many cases.
LGSep 19, 2023
Corporate Credit Rating: A SurveyBojing Feng, Xi Cheng, Dan Li et al.
Corporate credit rating (CCR) plays a very important role in the process of contemporary economic and social development. How to use credit rating methods for enterprises has always been a problem worthy of discussion. Through reading and studying the relevant literature at home and abroad, this paper makes a systematic survey of CCR. This paper combs the context of the development of CCR methods from the three levels: statistical models, machine learning models and neural network models, summarizes the common databases of CCR, and deeply compares the advantages and disadvantages of the models. Finally, this paper summarizes the problems existing in the current research and prospects the future of CCR. Compared with the existing review of CCR, this paper expounds and analyzes the progress of neural network model in this field in recent years.
73.9LGApr 12
WaveMoE: A Wavelet-Enhanced Mixture-of-Experts Foundation Model for Time Series ForecastingShunyu Wu, Jiawei Huang, Weibin Feng et al.
Time series foundation models (TSFMs) have recently achieved remarkable success in universal forecasting by leveraging large-scale pretraining on diverse time series data. Complementing this progress, incorporating frequency-domain information yields promising performance in enhancing the modeling of complex temporal patterns, such as periodicity and localized high-frequency dynamics, which are prevalent in real-world time series. To advance this direction, we propose a new perspective that integrates explicit frequency-domain representations into scalable foundation models, and introduce WaveMoE, a wavelet-enhanced mixture-of-experts foundation model for time series forecasting. WaveMoE adopts a dual-path architecture that jointly processes time series tokens and wavelet tokens aligned along a unified temporal axis, and coordinates them through a shared expert routing mechanism that enables consistent expert specialization while efficiently scaling model capacity. Preliminary experimental results on 16 diverse benchmark datasets indicate that WaveMoE has the potential to further improve forecasting performance by incorporating wavelet-domain corpora.
57.7LGMar 24
MsFormer: Enabling Robust Predictive Maintenance Services for Industrial DevicesJiahui Zhou, Dan Li, Ruibing Jin et al.
Providing reliable predictive maintenance is a critical industrial AI service essential for ensuring the high availability of manufacturing devices. Existing deep-learning methods present competitive results on such tasks but lack a general service-oriented framework to capture complex dependencies in industrial IoT sensor data. While Transformer-based models show strong sequence modeling capabilities, their direct deployment as robust AI services faces significant bottlenecks. Specifically, streaming sensor data collected in real-world service environments often exhibits multi-scale temporal correlations driven by machine working principles. Besides, the datasets available for training time-to-failure predictive services are typically limited in size. These issues pose significant challenges for directly applying existing models as robust predictive services. To address these challenges, we propose MsFormer, a lightweight Multi-scale Transformer designed as a unified AI service model for reliable industrial predictive maintenance. MsFormer incorporates a Multi-scale Sampling (MS) module and a tailored position encoding mechanism to capture sequential correlations across multi-streaming service data. Additionally, to accommodate data-scarce service environments, MsFormer adopts a lightweight attention mechanism with straightforward pooling operations instead of self-attention. Extensive experiments on real-world datasets demonstrate that the proposed framework achieves significant performance improvements over state-of-the-art methods. Furthermore, MsFormer outperforms across industrial devices and operating conditions, demonstrating strong generalizability while maintaining a highly reliable Quality of Service (QoS).
CVMar 11, 2025Code
Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive SensingChen Liao, Yan Shen, Dan Li et al.
Recently, Deep Unfolding Networks (DUNs) have achieved impressive reconstruction quality in the field of image Compressive Sensing (CS) by unfolding iterative optimization algorithms into neural networks. The reconstruction quality of DUNs depends on the learned prior knowledge, so introducing stronger prior knowledge can further improve reconstruction quality. On the other hand, pre-trained diffusion models contain powerful prior knowledge and have a solid theoretical foundation and strong scalability, but it requires a large number of iterative steps to achieve reconstruction. In this paper, we propose to use the powerful prior knowledge of pre-trained diffusion model in DUNs to achieve high-quality reconstruction with less steps for image CS. Specifically, we first design an iterative optimization algorithm named Diffusion Message Passing (DMP), which embeds a pre-trained diffusion model into each iteration process of DMP. Then, we deeply unfold the DMP algorithm into a neural network named DMP-DUN. The proposed DMP-DUN can use lightweight neural networks to achieve mapping from measurement data to the intermediate steps of the reverse diffusion process and directly approximate the divergence of the diffusion model, thereby further improving reconstruction efficiency. Extensive experiments show that our proposed DMP-DUN achieves state-of-the-art performance and requires at least only 2 steps to reconstruct the image. Codes are available at https://github.com/FengodChen/DMP-DUN-CVPR2025.
AINov 6, 2025
DMA: Online RAG Alignment with Human FeedbackYu Bai, Yukai Miao, Dawei Wang et al.
Retrieval-augmented generation (RAG) systems often rely on static retrieval, limiting adaptation to evolving intent and content drift. We introduce Dynamic Memory Alignment (DMA), an online learning framework that systematically incorporates multi-granularity human feedback to align ranking in interactive settings. DMA organizes document-, list-, and response-level signals into a coherent learning pipeline: supervised training for pointwise and listwise rankers, policy optimization driven by response-level preferences, and knowledge distillation into a lightweight scorer for low-latency serving. Throughout this paper, memory refers to the model's working memory, which is the entire context visible to the LLM for In-Context Learning. We adopt a dual-track evaluation protocol mirroring deployment: (i) large-scale online A/B ablations to isolate the utility of each feedback source, and (ii) few-shot offline tests on knowledge-intensive benchmarks. Online, a multi-month industrial deployment further shows substantial improvements in human engagement. Offline, DMA preserves competitive foundational retrieval while yielding notable gains on conversational QA (TriviaQA, HotpotQA). Taken together, these results position DMA as a principled approach to feedback-driven, real-time adaptation in RAG without sacrificing baseline capability.
SESep 19, 2025Code
Generating High-Quality Datasets for Code Editing via Open-Source Language ModelsZekai Zhang, Mingwei Liu, Zhenxi Chen et al.
Code editing plays a vital role in software engineering, requiring developers to adjust existing code according to natural language instructions while keeping functionality intact and avoiding unnecessary modifications. However, commit-based datasets commonly used for this task are often noisy, lack diversity, and fail to reflect the style of real-world edit instructions. To address this, we introduce OpenCodeEdit, an open-source pipeline that leverages multiple LLMs to synthesize realistic code-edit triplets. The pipeline produces both concise "lazy" instructions and more detailed "descriptive" ones, and applies filtering based on diffs and topics to guarantee data quality and variety. Using this process, we construct OCEDataFT, a curated dataset of 20K samples. Fine-tuning three advanced base models on OCEDataFT leads to significant performance boosts on the CanItEdit benchmark, with relative pass@1 improvements ranging from 4.50% to 20.79%. Notably, the resulting models achieve performance close to closed-source systems, narrowing the gap to GPT-4 to just 3.54%, without relying on proprietary resources or manual annotation.
AIAug 22, 2025Code
Integrating Time Series into LLMs via Multi-layer Steerable Embedding Fusion for Enhanced ForecastingZhuomin Chen, Dan Li, Jiahui Zhou et al.
Time series (TS) data are ubiquitous across various application areas, rendering time series forecasting (TSF) a fundamental task. With the astounding advances in large language models (LLMs), a variety of methods have been developed to adapt LLMs for time series forecasting. Despite unlocking the potential of LLMs in comprehending TS data, existing methods are inherently constrained by their shallow integration of TS information, wherein LLMs typically access TS representations at shallow layers, primarily at the input layer. This causes the influence of TS representations to progressively fade in deeper layers and eventually leads to ineffective adaptation between textual embeddings and TS representations. In this paper, we propose the Multi-layer Steerable Embedding Fusion (MSEF), a novel framework that enables LLMs to directly access time series patterns at all depths, thereby mitigating the progressive loss of TS information in deeper layers. Specifically, MSEF leverages off-the-shelf time series foundation models to extract semantically rich embeddings, which are fused with intermediate text representations across LLM layers via layer-specific steering vectors. These steering vectors are designed to continuously optimize the alignment between time series and textual modalities and facilitate a layer-specific adaptation mechanism that ensures efficient few-shot learning capabilities. Experimental results on seven benchmarks demonstrate significant performance improvements by MSEF compared with baselines, with an average reduction of 31.8% in terms of MSE. The code is available at https://github.com/One1sAll/MSEF.
CVSep 23, 2021Code
Paint4Poem: A Dataset for Artistic Visualization of Classical Chinese PoemsDan Li, Shuai Wang, Jie Zou et al.
In this work we propose a new task: artistic visualization of classical Chinese poems, where the goal is to generatepaintings of a certain artistic style for classical Chinese poems. For this purpose, we construct a new dataset called Paint4Poem. Thefirst part of Paint4Poem consists of 301 high-quality poem-painting pairs collected manually from an influential modern Chinese artistFeng Zikai. As its small scale poses challenges for effectively training poem-to-painting generation models, we introduce the secondpart of Paint4Poem, which consists of 3,648 caption-painting pairs collected manually from Feng Zikai's paintings and 89,204 poem-painting pairs collected automatically from the web. We expect the former to help learning the artist painting style as it containshis most paintings, and the latter to help learning the semantic relevance between poems and paintings. Further, we analyze Paint4Poem regarding poem diversity, painting style, and the semantic relevance between poems and paintings. We create abenchmark for Paint4Poem: we train two representative text-to-image generation models: AttnGAN and MirrorGAN, and evaluate theirperformance regarding painting pictorial quality, painting stylistic relevance, and semantic relevance between poems and paintings.The results indicate that the models are able to generate paintings that have good pictorial quality and mimic Feng Zikai's style, but thereflection of poem semantics is limited. The dataset also poses many interesting research directions on this task, including transferlearning, few-shot learning, text-to-image generation for low-resource data etc. The dataset is publicly available.(https://github.com/paint4poem/paint4poem)
LGJul 31, 2021Code
Missingness Augmentation: A General Approach for Improving Generative Imputation ModelsYufeng Wang, Dan Li, Cong Xu et al.
Missing data imputation is a fundamental problem in data analysis, and many studies have been conducted to improve its performance by exploring model structures and learning procedures. However, data augmentation, as a simple yet effective method, has not received enough attention in this area. In this paper, we propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models. Our approach dynamically produces incomplete samples at each epoch by utilizing the generator's output, constraining the augmented samples using a simple reconstruction loss, and combining this loss with the original loss to form the final optimization objective. As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks, providing a simple yet effective way to enhance their performance. Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models on a variety of tabular and image datasets. The code is available at \url{https://github.com/WYu-Feng/Missingness-Augmentation}.
LGNov 16, 2020Code
PC-GAIN: Pseudo-label Conditional Generative Adversarial Imputation Networks for Incomplete DataYufeng Wang, Dan Li, Xiang Li et al.
Datasets with missing values are very common in real world applications. GAIN, a recently proposed deep generative model for missing data imputation, has been proved to outperform many state-of-the-art methods. But GAIN only uses a reconstruction loss in the generator to minimize the imputation error of the non-missing part, ignoring the potential category information which can reflect the relationship between samples. In this paper, we propose a novel unsupervised missing data imputation method named PC-GAIN, which utilizes potential category information to further enhance the imputation power. Specifically, we first propose a pre-training procedure to learn potential category information contained in a subset of low-missing-rate data. Then an auxiliary classifier is determined using the synthetic pseudo-labels. Further, this classifier is incorporated into the generative adversarial framework to help the generator to yield higher quality imputation results. The proposed method can improve the imputation quality of GAIN significantly. Experimental results on various benchmark datasets show that our method is also superior to other baseline approaches. Our code is available at \url{https://github.com/WYu-Feng/pc-gain}.
68.5AIMay 10
Empowering VLMs for Few-Shot Multimodal Time Series Classification via Tailored Agentic ReasoningLin Li, Jiawei Huang, Qihao Quan et al.
In this paper, we propose the first VL$\underline{\textbf{M}}$ $\underline{\textbf{a}}$gentic $\underline{\textbf{r}}$easoning framework for few-$\underline{\textbf{s}}$hot multimodal $\underline{\textbf{T}}$ime $\underline{\textbf{S}}$eries $\underline{\textbf{C}}$lassification ($\textbf{MarsTSC}$), which introduces a self-evolving knowledge bank as a dynamic context iteratively refined via reflective agentic reasoning. The framework comprises three collaborative roles: i) Generator conducts reliable classification via reasoning; ii) Reflector diagnoses the root causes of reasoning errors to yield discriminative insights targeting the temporal features overlooked by Generator; iii) Modifier applies verified updates to the knowledge bank to prevent context collapse. We further introduce a test-time update strategy to enable cautious, continuous knowledge bank refinement to mitigate few-shot bias and distribution shift. Extensive experiments across 12 mainstream time series benchmarks demonstrate that $\textbf{MarsTSC}$ delivers substantial and consistent performance gains across 6 VLM backbones, outperforming both classical and foundation model-based time series baselines under few-shot conditions, while producing interpretable rationales that ground each classification decision in human-readable feature evidence.
45.1CRMar 30
Democratizing Federated Learning with Blockchain and Multi-Task Peer PredictionLeon Witt, Kentaroh Toyoda, Wojciech Samek et al.
The synergy between Federated Learning and blockchain has been considered promising; however, the computationally intensive nature of contribution measurement conflicts with the strict computation and storage limits of blockchain systems. We propose a novel concept to decentralize the AI training process using blockchain technology and Multi-task Peer Prediction. By leveraging smart contracts and cryptocurrencies to incentivize contributions to the training process, we aim to harness the mutual benefits of AI and blockchain. We discuss the advantages and limitations of our design.
LGOct 2, 2023
Adversarial Client Detection via Non-parametric Subspace Monitoring in the Internet of Federated ThingsXianjian Xie, Xiaochen Xian, Dan Li et al.
The Internet of Federated Things (IoFT) represents a network of interconnected systems with federated learning as the backbone, facilitating collaborative knowledge acquisition while ensuring data privacy for individual systems. The wide adoption of IoFT, however, is hindered by security concerns, particularly the susceptibility of federated learning networks to adversarial attacks. In this paper, we propose an effective non-parametric approach FedRR, which leverages the low-rank features of the transmitted parameter updates generated by federated learning to address the adversarial attack problem. Besides, our proposed method is capable of accurately detecting adversarial clients and controlling the false alarm rate under the scenario with no attack occurring. Experiments based on digit recognition using the MNIST datasets validated the advantages of our approach.
LGJan 15
Understanding and Preserving Safety in Fine-Tuned LLMsJiawen Zhang, Yangfan Hu, Kejia Chen et al.
Fine-tuning is an essential and pervasive functionality for applying large language models (LLMs) to downstream tasks. However, it has the potential to substantially degrade safety alignment, e.g., by greatly increasing susceptibility to jailbreak attacks, even when the fine-tuning data is entirely harmless. Despite garnering growing attention in defense efforts during the fine-tuning stage, existing methods struggle with a persistent safety-utility dilemma: emphasizing safety compromises task performance, whereas prioritizing utility typically requires deep fine-tuning that inevitably leads to steep safety declination. In this work, we address this dilemma by shedding new light on the geometric interaction between safety- and utility-oriented gradients in safety-aligned LLMs. Through systematic empirical analysis, we uncover three key insights: (I) safety gradients lie in a low-rank subspace, while utility gradients span a broader high-dimensional space; (II) these subspaces are often negatively correlated, causing directional conflicts during fine-tuning; and (III) the dominant safety direction can be efficiently estimated from a single sample. Building upon these novel insights, we propose safety-preserving fine-tuning (SPF), a lightweight approach that explicitly removes gradient components conflicting with the low-rank safety subspace. Theoretically, we show that SPF guarantees utility convergence while bounding safety drift. Empirically, SPF consistently maintains downstream task performance and recovers nearly all pre-trained safety alignment, even under adversarial fine-tuning scenarios. Furthermore, SPF exhibits robust resistance to both deep fine-tuning and dynamic jailbreak attacks. Together, our findings provide new mechanistic understanding and practical guidance toward always-aligned LLM fine-tuning.
SPNov 11, 2025
Toward Adaptive BCIs: Enhancing Decoding Stability via User State-Aware EEG FilteringYeon-Woo Choi, Hye-Bin Shin, Dan Li
Brain-computer interfaces (BCIs) often suffer from limited robustness and poor long-term adaptability. Model performance rapidly degrades when user attention fluctuates, brain states shift over time, or irregular artifacts appear during interaction. To mitigate these issues, we introduce a user state-aware electroencephalogram (EEG) filtering framework that refines neural representations before decoding user intentions. The proposed method continuously estimates the user's cognitive state (e.g., focus or distraction) from EEG features and filters unreliable segments by applying adaptive weighting based on the estimated attention level. This filtering stage suppresses noisy or out-of-focus epochs, thereby reducing distributional drift and improving the consistency of subsequent decoding. Experiments on multiple EEG datasets that emulate real BCI scenarios demonstrate that the proposed state-aware filtering enhances classification accuracy and stability across different user states and sessions compared with conventional preprocessing pipelines. These findings highlight that leveraging brain-derived state information--even without additional user labels--can substantially improve the reliability of practical EEG-based BCIs.
SEMar 31, 2024
Face It Yourselves: An LLM-Based Two-Stage Strategy to Localize Configuration Errors via LogsShiwen Shan, Yintong Huo, Yuxin Su et al.
Configurable software systems are prone to configuration errors, resulting in significant losses to companies. However, diagnosing these errors is challenging due to the vast and complex configuration space. These errors pose significant challenges for both experienced maintainers and new end-users, particularly those without access to the source code of the software systems. Given that logs are easily accessible to most end-users, we conduct a preliminary study to outline the challenges and opportunities of utilizing logs in localizing configuration errors. Based on the insights gained from the preliminary study, we propose an LLM-based two-stage strategy for end-users to localize the root-cause configuration properties based on logs. We further implement a tool, LogConfigLocalizer, aligned with the design of the aforementioned strategy, hoping to assist end-users in coping with configuration errors through log analysis. To the best of our knowledge, this is the first work to localize the root-cause configuration properties for end-users based on Large Language Models~(LLMs) and logs. We evaluate the proposed strategy on Hadoop by LogConfigLocalizer and prove its efficiency with an average accuracy as high as 99.91%. Additionally, we also demonstrate the effectiveness and necessity of different phases of the methodology by comparing it with two other variants and a baseline tool. Moreover, we validate the proposed methodology through a practical case study to demonstrate its effectiveness and feasibility.
CVApr 19, 2024
EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene RepresentationWenkai Liu, Tao Guan, Bin Zhu et al.
In the domain of 3D scene representation, 3D Gaussian Splatting (3DGS) has emerged as a pivotal technology. However, its application to large-scale, high-resolution scenes (exceeding 4k$\times$4k pixels) is hindered by the excessive computational requirements for managing a large number of Gaussians. Addressing this, we introduce 'EfficientGS', an advanced approach that optimizes 3DGS for high-resolution, large-scale scenes. We analyze the densification process in 3DGS and identify areas of Gaussian over-proliferation. We propose a selective strategy, limiting Gaussian increase to key primitives, thereby enhancing the representational efficiency. Additionally, we develop a pruning mechanism to remove redundant Gaussians, those that are merely auxiliary to adjacent ones. For further enhancement, we integrate a sparse order increment for Spherical Harmonics (SH), designed to alleviate storage constraints and reduce training overhead. Our empirical evaluations, conducted on a range of datasets including extensive 4K+ aerial images, demonstrate that 'EfficientGS' not only expedites training and rendering times but also achieves this with a model size approximately tenfold smaller than conventional 3DGS while maintaining high rendering fidelity.
LGNov 30, 2025
Bayesian dynamic scheduling of multipurpose batch processes under incomplete look-ahead informationTaicheng Zheng, Dan Li, Jie Li
Multipurpose batch processes become increasingly popular in manufacturing industries since they adapt to low-volume, high-value products and shifting demands. These processes often operate in a dynamic environment, which faces disturbances such as processing delays and demand changes. To minimise long-term cost and system nervousness (i.e., disruptive changes to schedules), schedulers must design rescheduling strategies to address such disturbances effectively. Existing methods often assume complete look-ahead information over the scheduling horizon. This assumption contrasts with realistic situations where schedulers can only access incomplete look-ahead information. Sticking with existing methods may lead to suboptimal long-term costs and high-level system nervousness. In this work we propose a Bayesian dynamic scheduling method. Our method relies on learning a Bayesian Network from the probability distribution of disturbances. Specifically, the Bayesian Network represents how likely each operation will be impacted by disturbances. During the online execution, when new disturbances become observed, this method updates the posterior distribution and therefore guides the rescheduling strategy. We compare our method with the existing periodic rescheduling strategy (which generates new schedules from scratch at fixed intervals) on four benchmark problems. Computational results show that our method achieves statistically better long-term costs and system nervousness. In the theoretical aspect, we prove that if disturbances are mutually independent, the impact-quantifying variables inherently satisfy the independence assumptions required by Bayesian Networks. As an implication, practitioners can extend the method to other scheduling problems (such as job shop scheduling and continuous processes), given that they define the problem-specific dependencies between operations.
LGNov 10, 2025
Lightweight Time Series Data Valuation on Time Series Foundation Models via In-Context FinetuningShunyu Wu, Tianyue Li, Yixuan Leng et al.
Time series foundation models (TSFMs) have demonstrated increasing capabilities due to their extensive pretraining on large volumes of diverse time series data. Consequently, the quality of time series data is crucial to TSFM performance, rendering an accurate and efficient data valuation of time series for TSFMs indispensable. However, traditional data valuation methods, such as influence functions, face severe computational bottlenecks due to their poor scalability with growing TSFM model sizes and often fail to preserve temporal dependencies. In this paper, we propose LTSV, a Lightweight Time Series Valuation on TSFMS via in-context finetuning. Grounded in the theoretical evidence that in-context finetuning approximates the influence function, LTSV estimates a sample's contribution by measuring the change in context loss after in-context finetuning, leveraging the strong generalization capabilities of TSFMs to produce robust and transferable data valuations. To capture temporal dependencies, we introduce temporal block aggregation, which integrates per-block influence scores across overlapping time windows. Experiments across multiple time series datasets and models demonstrate that LTSV consistently provides reliable and strong valuation performance, while maintaining manageable computational requirements. Our results suggest that in-context finetuning on time series foundation models provides a practical and effective bridge between data attribution and model generalization in time series learning.
CVOct 29, 2024
EEG-based Multimodal Representation Learning for Emotion RecognitionKang Yin, Hye-Bin Shin, Dan Li et al.
Multimodal learning has been a popular area of research, yet integrating electroencephalogram (EEG) data poses unique challenges due to its inherent variability and limited availability. In this paper, we introduce a novel multimodal framework that accommodates not only conventional modalities such as video, images, and audio, but also incorporates EEG data. Our framework is designed to flexibly handle varying input sizes, while dynamically adjusting attention to account for feature importance across modalities. We evaluate our approach on a recently introduced emotion recognition dataset that combines data from three modalities, making it an ideal testbed for multimodal learning. The experimental results provide a benchmark for the dataset and demonstrate the effectiveness of the proposed framework. This work highlights the potential of integrating EEG into multimodal systems, paving the way for more robust and comprehensive applications in emotion recognition and beyond.
CRDec 10, 2025
FBA$^2$D: Frequency-based Black-box Attack for AI-generated Image DetectionXiaojing Chen, Dan Li, Lijun Peng et al.
The prosperous development of Artificial Intelligence-Generated Content (AIGC) has brought people's anxiety about the spread of false information on social media. Designing detectors for filtering is an effective defense method, but most detectors will be compromised by adversarial samples. Currently, most studies exposing AIGC security issues assume information on model structure and data distribution. In real applications, attackers query and interfere with models that provide services in the form of application programming interfaces (APIs), which constitutes the black-box decision-based attack paradigm. However, to the best of our knowledge, decision-based attacks on AIGC detectors remain unexplored. In this study, we propose \textbf{FBA$^2$D}: a frequency-based black-box attack method for AIGC detection to fill the research gap. Motivated by frequency-domain discrepancies between generated and real images, we develop a decision-based attack that leverages the Discrete Cosine Transform (DCT) for fine-grained spectral partitioning and selects frequency bands as query subspaces, improving both query efficiency and image quality. Moreover, attacks on AIGC detectors should mitigate initialization failures, preserve image quality, and operate under strict query budgets. To address these issues, we adopt an ``adversarial example soup'' method, averaging candidates from successive surrogate iterations and using the result as the initialization to accelerate the query-based attack. The empirical study on the Synthetic LSUN dataset and GenImage dataset demonstrate the effectiveness of our prosed method. This study shows the urgency of addressing practical AIGC security problems.
AIJun 1, 2025
Enhancing LLM Reasoning for Time Series Classification by Tailored Thinking and Fused DecisionJiahui Zhou, Dan Li, Lin Li et al.
The reasoning capabilities of large language models (LLMs) have significantly advanced their performance by enabling in-depth understanding of diverse tasks. With growing interest in applying LLMs to the time series domain, this has proven nontrivial, as evidenced by the limited efficacy of straightforwardly adapting text-domain reasoning techniques. Although recent work has shown promise in several time series tasks, further leveraging advancements in LLM reasoning remains under-explored for time series classification (TSC) tasks, despite their prevalence and significance in many real-world applications. In this paper, we propose ReasonTSC, a novel framework designed to effectively leverage LLM reasoning for time series classification through both a multi-turn reasoning and a fused decision-making strategy tailored to TSC. Rather than straightforwardly applying existing reasoning techniques or relying solely on LLMs' built-in reasoning capabilities, ReasonTSC first steers the model to think over the essential characteristics of time series data. Next, it integrates predictions and confidence scores from plug-in classifiers, e.g., domain-specific time series models, as in-context examples. Finally, ReasonTSC guides the LLM through a structured reasoning process: it evaluates the initial assessment, backtracks to consider alternative hypotheses, and compares their merits before arriving at a final classification. Extensive experiments and systematic ablation studies demonstrate that ReasonTSC consistently outperforms both existing time series reasoning baselines and plug-in models, and is even capable of identifying and correcting plug-in models' false predictions.
CRFeb 2, 2025
Activation Approximations Can Incur Safety Vulnerabilities Even in Aligned LLMs: Comprehensive Analysis and DefenseJiawen Zhang, Kejia Chen, Lipeng He et al.
Large Language Models (LLMs) have showcased remarkable capabilities across various domains. Accompanying the evolving capabilities and expanding deployment scenarios of LLMs, their deployment challenges escalate due to their sheer scale and the advanced yet complex activation designs prevalent in notable model series, such as Llama, Gemma, Mistral. These challenges have become particularly pronounced in resource-constrained deployment scenarios, where mitigating inference bottlenecks is imperative. Among various recent efforts, activation approximation has emerged as a promising avenue for pursuing inference efficiency, sometimes considered indispensable in applications such as private inference. Despite achieving substantial speedups with minimal impact on utility, even appearing sound and practical for real-world deployment, the safety implications of activation approximations remain unclear. In this work, we fill this critical gap in LLM safety by conducting the first systematic safety evaluation of activation approximations. Our safety vetting spans seven state-of-the-art techniques across three popular categories (activation polynomialization, activation sparsification, and activation quantization), revealing consistent safety degradation across ten safety-aligned LLMs. To overcome the hurdle of devising a unified defense accounting for diverse activation approximation methods, we perform an in-depth analysis of their shared error patterns and uncover three key findings. We propose QuadA, a novel safety enhancement method tailored to mitigate the safety compromises introduced by activation approximations. Extensive experiments and ablation studies corroborate QuadA's effectiveness in enhancing the safety capabilities of LLMs after activation approximations.
CLApr 25, 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic ReviewToghrul Abbasli, Kentaroh Toyoda, Yuan Wang et al.
Large Language Models (LLMs) have been transformative across many domains. However, hallucination -- confidently outputting incorrect information -- remains one of the leading challenges for LLMs. This raises the question of how to accurately assess and quantify the uncertainty of LLMs. Extensive literature on traditional models has explored Uncertainty Quantification (UQ) to measure uncertainty and employed calibration techniques to address the misalignment between uncertainty and accuracy. While some of these methods have been adapted for LLMs, the literature lacks an in-depth analysis of their effectiveness and does not offer a comprehensive benchmark to enable insightful comparison among existing solutions. In this work, we fill this gap via a systematic survey of representative prior works on UQ and calibration for LLMs and introduce a rigorous benchmark. Using two widely used reliability datasets, we empirically evaluate six related methods, which justify the significant findings of our review. Finally, we provide outlooks for key future directions and outline open challenges. To the best of our knowledge, this survey is the first dedicated study to review the calibration methods and relevant metrics for LLMs.
LGMar 14, 2025
BACE-RUL: A Bi-directional Adversarial Network with Covariate Encoding for Machine Remaining Useful Life PredictionZekai Zhang, Dan Li, Shunyu Wu et al.
Prognostic and Health Management (PHM) are crucial ways to avoid unnecessary maintenance for Cyber-Physical Systems (CPS) and improve system reliability. Predicting the Remaining Useful Life (RUL) is one of the most challenging tasks for PHM. Existing methods require prior knowledge about the system, contrived assumptions, or temporal mining to model the life cycles of machine equipment/devices, resulting in diminished accuracy and limited applicability in real-world scenarios. This paper proposes a Bi-directional Adversarial network with Covariate Encoding for machine Remaining Useful Life (BACE-RUL) prediction, which only adopts sensor measurements from the current life cycle to predict RUL rather than relying on previous consecutive cycle recordings. The current sensor measurements of mechanical devices are encoded to a conditional space to better understand the implicit inner mechanical status. The predictor is trained as a conditional generative network with the encoded sensor measurements as its conditions. Various experiments on several real-world datasets, including the turbofan aircraft engine dataset and the dataset collected from degradation experiments of Li-Ion battery cells, show that the proposed model is a general framework and outperforms state-of-the-art methods.
SEDec 30, 2024
LicenseGPT: A Fine-tuned Foundation Model for Publicly Available Dataset License ComplianceJingwen Tan, Gopi Krishnan Rajbahadur, Zi Li et al.
Dataset license compliance is a critical yet complex aspect of developing commercial AI products, particularly with the increasing use of publicly available datasets. Ambiguities in dataset licenses pose significant legal risks, making it challenging even for software IP lawyers to accurately interpret rights and obligations. In this paper, we introduce LicenseGPT, a fine-tuned foundation model (FM) specifically designed for dataset license compliance analysis. We first evaluate existing legal FMs (i.e., FMs specialized in understanding and processing legal texts) and find that the best-performing model achieves a Prediction Agreement (PA) of only 43.75%. LicenseGPT, fine-tuned on a curated dataset of 500 licenses annotated by legal experts, significantly improves PA to 64.30%, outperforming both legal and general-purpose FMs. Through an A/B test and user study with software IP lawyers, we demonstrate that LicenseGPT reduces analysis time by 94.44%, from 108 seconds to 6 seconds per license, without compromising accuracy. Software IP lawyers perceive LicenseGPT as a valuable supplementary tool that enhances efficiency while acknowledging the need for human oversight in complex cases. Our work underscores the potential of specialized AI tools in legal practice and offers a publicly available resource for practitioners and researchers.
AIMay 22, 2024
Blockchain and Artificial Intelligence: Synergies and ConflictsLeon Witt, Armando Teles Fortes, Kentaroh Toyoda et al.
Blockchain technology and Artificial Intelligence (AI) have emerged as transformative forces in their respective domains. This paper explores synergies and challenges between these two technologies. Our research analyses the biggest projects combining blockchain and AI, based on market capitalization, and derives a novel framework to categorize contemporary and future use cases. Despite the theoretical compatibility, current real-world applications combining blockchain and AI remain in their infancy.
AIJan 25
EntWorld: A Holistic Environment and Benchmark for Verifiable Enterprise GUI AgentsYing Mo, Yu Bai, Dapeng Sun et al.
Recent advances in Multimodal Large Language Models (MLLMs) have enabled agents to operate in open-ended web and operating system environments. However, existing benchmarks predominantly target consumer-oriented scenarios (e.g., e-commerce and travel booking), failing to capture the complexity and rigor of professional enterprise workflows. Enterprise systems pose distinct challenges, including high-density user interfaces, strict business logic constraints, and a strong reliance on precise, state-consistent information retrieval-settings in which current generalist agents often struggle. To address this gap, we introduce EntWorld, a large-scale benchmark consisting of 1,756 tasks across six representative enterprise domains, including customer relationship management (CRM), information technology infrastructure library (ITIL), and enterprise resource planning (ERP) systems. Unlike previous datasets that depend on fragile execution traces or extensive manual annotation, EntWorld adopts a schema-grounded task generation framework that directly reverse-engineers business logic from underlying database schemas, enabling the synthesis of realistic, long-horizon workflows. Moreover, we propose a SQL-based deterministic verification mechanism in building datasets that replaces ambiguous visual matching with rigorous state-transition validation. Experimental results demonstrate that state-of-the-art models (e.g., GPT-4.1) achieve 47.61% success rate on EntWorld, substantially lower than the human performance, highlighting a pronounced enterprise gap in current agentic capabilities and the necessity of developing domain-specific agents. We release EntWorld as a rigorous testbed to facilitate the development and evaluation of the next generation of enterprise-ready digital agents.
LGNov 24, 2025
Prototype-Guided Non-Exemplar Continual Learning for Cross-subject EEG DecodingDan Li, Hye-Bin Shin, Yeon-Woo Choi
Due to the significant variability in electroencephalo-gram (EEG) signals across individuals, knowledge acquired from previous subjects is often overwritten as new subjects are introduced in continual EEG decoding tasks. Existing methods mainly rely on storing historical data from seen subjects as replay buffers to mitigate forgetting, which is impractical under privacy or memory constraints. To address this issue, we propose a Prototype-guided Non-Exemplar Continual Learning (ProNECL) framework that preserves prior knowledge without accessing historical EEG samples. ProNECL summarizes subject-specific discriminative representations into class-level prototypes and incrementally aligns new subject representations with a global prototype memory through prototype-based feature regulariza-tion and cross-subject alignment. Experiments on the BCI Com-petition IV 2a and 2b datasets demonstrate that ProNECL effec-tively balances knowledge retention and adaptability, achieving superior performance in cross-subject continual EEG decoding tasks.
LGNov 5, 2025
Tensor-Efficient High-Dimensional Q-learningJunyi Wu, Dan Li
High-dimensional reinforcement learning faces challenges with complex calculations and low sample efficiency in large state-action spaces. Q-learning algorithms struggle particularly with the curse of dimensionality, where the number of state-action pairs grows exponentially with problem size. While neural network-based approaches like Deep Q-Networks have shown success, recent tensor-based methods using low-rank decomposition offer more parameter-efficient alternatives. Building upon existing tensor-based methods, we propose Tensor-Efficient Q-Learning (TEQL), which enhances low-rank tensor decomposition via improved block coordinate descent on discretized state-action spaces, incorporating novel exploration and regularization mechanisms. The key innovation is an exploration strategy that combines approximation error with visit count-based upper confidence bound to prioritize actions with high uncertainty, avoiding wasteful random exploration. Additionally, we incorporate a frequency-based penalty term in the objective function to encourage exploration of less-visited state-action pairs and reduce overfitting to frequently visited regions. Empirical results on classic control tasks demonstrate that TEQL outperforms conventional matrix-based methods and deep RL approaches in both sample efficiency and total rewards, making it suitable for resource-constrained applications, such as space and healthcare where sampling costs are high.
ROSep 13, 2025
ViSTR-GP: Online Cyberattack Detection via Vision-to-State Tensor Regression and Gaussian Processes in Automated Robotic OperationsNavid Aftabi, Philip Samaha, Jin Ma et al.
Industrial robotic systems are central to automating smart manufacturing operations. Connected and automated factories face growing cybersecurity risks that can potentially cause interruptions and damages to physical operations. Among these attacks, data-integrity attacks often involve sophisticated exploitation of vulnerabilities that enable an attacker to access and manipulate the operational data and are hence difficult to detect with only existing intrusion detection or model-based detection. This paper addresses the challenges in utilizing existing side-channels to detect data-integrity attacks in robotic manufacturing processes by developing an online detection framework, ViSTR-GP, that cross-checks encoder-reported measurements against a vision-based estimate from an overhead camera outside the controller's authority. In this framework, a one-time interactive segmentation initializes SAM-Track to generate per-frame masks. A low-rank tensor-regression surrogate maps each mask to measurements, while a matrix-variate Gaussian process models nominal residuals, capturing temporal structure and cross-joint correlations. A frame-wise test statistic derived from the predictive distribution provides an online detector with interpretable thresholds. We validate the framework on a real-world robotic testbed with synchronized video frame and encoder data, collecting multiple nominal cycles and constructing replay attack scenarios with graded end-effector deviations. Results on the testbed indicate that the proposed framework recovers joint angles accurately and detects data-integrity attacks earlier with more frequent alarms than all baselines. These improvements are most evident in the most subtle attacks. These results show that plants can detect data-integrity attacks by adding an independent physical channel, bypassing the controller's authority, without needing complex instrumentation.
SYAug 29, 2025
DynaMark: A Reinforcement Learning Framework for Dynamic Watermarking in Industrial Machine Tool ControllersNavid Aftabi, Abhishek Hanchate, Satish Bukkapatnam et al.
Industry 4.0's highly networked Machine Tool Controllers (MTCs) are prime targets for replay attacks that use outdated sensor data to manipulate actuators. Dynamic watermarking can reveal such tampering, but current schemes assume linear-Gaussian dynamics and use constant watermark statistics, making them vulnerable to the time-varying, partly proprietary behavior of MTCs. We close this gap with DynaMark, a reinforcement learning framework that models dynamic watermarking as a Markov decision process (MDP). It learns an adaptive policy online that dynamically adapts the covariance of a zero-mean Gaussian watermark using available measurements and detector feedback, without needing system knowledge. DynaMark maximizes a unique reward function balancing control performance, energy consumption, and detection confidence dynamically. We develop a Bayesian belief updating mechanism for real-time detection confidence in linear systems. This approach, independent of specific system assumptions, underpins the MDP for systems with linear dynamics. On a Siemens Sinumerik 828D controller digital twin, DynaMark achieves a reduction in watermark energy by 70% while preserving the nominal trajectory, compared to constant variance baselines. It also maintains an average detection delay equivalent to one sampling interval. A physical stepper-motor testbed validates these findings, rapidly triggering alarms with less control performance decline and exceeding existing benchmarks.
LGJun 2, 2025
TSRating: Rating Quality of Diverse Time Series Data by Meta-learning from LLM JudgmentShunyu Wu, Dan Li, Haozheng Ye et al.
High-quality time series (TS) data are essential for ensuring TS model performance, rendering research on rating TS data quality indispensable. Existing methods have shown promising rating accuracy within individual domains, primarily by extending data quality rating techniques such as influence functions and Shapley values to account for temporal characteristics. However, they neglect the fact that real-world TS data can span vastly different domains and exhibit distinct properties, hampering the accurate and efficient rating of diverse TS data. In this paper, we propose TSRating, a novel and unified framework for rating the quality of time series data crawled from diverse domains. TSRating is built on the assumption that LLMs inherit ample knowledge, acquired during their extensive pretraining, enabling them to comprehend and discern quality differences in diverse TS data. We verify this assumption by devising a series of prompts to elicit quality comparisons from LLMs for pairs of TS samples. We then fit a dedicated rating model, termed TSRater, to convert the LLMs' judgments into efficient quality predictions via TSRater's inference on future TS samples. To ensure cross-domain adaptability, we develop a meta-learning scheme to train TSRater on quality comparisons collected from nine distinct domains. To improve training efficiency, we employ signSGD for inner-loop updates, thus circumventing the demanding computation of hypergradients. Extensive experimental results on eleven benchmark datasets across three time series tasks, each using both conventional TS models and TS foundation models, demonstrate that TSRating outperforms baselines in terms of estimation accuracy, efficiency, and domain adaptability.
IRJun 21, 2024
Pistis-RAG: Enhancing Retrieval-Augmented Generation with Human FeedbackYu Bai, Yukai Miao, Li Chen et al.
RAG systems face limitations when semantic relevance alone does not guarantee improved generation quality. This issue becomes particularly evident due to the sensitivity of large language models (LLMs) to the ordering of few-shot prompts, which can affect model performance. To address this challenge, aligning LLM outputs with human preferences using structured feedback, such as options to copy, regenerate, or dislike, offers a promising method for improvement. This feedback is applied to the entire list of inputs rather than giving specific ratings for individual documents, making it a Listwide Labels Learning-to-Rank task. To address this task, we propose Pistis-RAG, a new RAG framework designed with a content-centric approach to better align LLMs with human preferences. Pistis-RAG effectively utilizes human feedback, enhancing content ranking and generation quality. To validate our framework, we use public datasets to simulate human feedback, allowing us to evaluate and refine our method effectively. Experimental results indicate that Pistis-RAG improves alignment with human preferences relative to the baseline RAG system, showing a 6.06% increase in MMLU (English) and a 7.08% increase in C-EVAL (Chinese) accuracy metrics. These results highlight Pistis-RAG's effectiveness in overcoming the limitations associated with traditional RAG approaches.
LGJun 13, 2024
LLM-based Knowledge Pruning for Time Series Data Analytics on Edge-computing DevicesRuibing Jin, Qing Xu, Min Wu et al.
Limited by the scale and diversity of time series data, the neural networks trained on time series data often overfit and show unsatisfacotry performances. In comparison, large language models (LLMs) recently exhibit impressive generalization in diverse fields. Although massive LLM based approaches are proposed for time series tasks, these methods require to load the whole LLM in both training and reference. This high computational demands limit practical applications in resource-constrained settings, like edge-computing and IoT devices. To address this issue, we propose Knowledge Pruning (KP), a novel paradigm for time series learning in this paper. For a specific downstream task, we argue that the world knowledge learned by LLMs is much redundant and only the related knowledge termed as "pertinent knowledge" is useful. Unlike other methods, our KP targets to prune the redundant knowledge and only distill the pertinent knowledge into the target model. This reduces model size and computational costs significantly. Additionally, different from existing LLM based approaches, our KP does not require to load the LLM in the process of training and testing, further easing computational burdens. With our proposed KP, a lightweight network can effectively learn the pertinent knowledge, achieving satisfactory performances with a low computation cost. To verify the effectiveness of our KP, two fundamental tasks on edge-computing devices are investigated in our experiments, where eight diverse environments or benchmarks with different networks are used to verify the generalization of our KP. Through experiments, our KP demonstrates effective learning of pertinent knowledge, achieving notable performance improvements in regression (19.7% on average) and classification (up to 13.7%) tasks, showcasing state-of-the-art results.
CLJun 8, 2024
Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity RecognitionChang Tian, Wenpeng Yin, Dan Li et al.
Few-shot named entity recognition (NER) systems recognize entities using a few labeled training examples. The general pipeline consists of a span detector to identify entity spans in text and an entity-type classifier to assign types to entities. Current span detectors rely on extensive manual labeling to guide training. Almost every span detector requires initial training on basic span features followed by adaptation to task-specific features. This process leads to repetitive training of the basic span features among span detectors. Additionally, metric-based entity-type classifiers, such as prototypical networks, typically employ a specific metric that gauges the distance between the query sample and entity-type referents, ultimately assigning the most probable entity type to the query sample. However, these classifiers encounter the sample dependency problem, primarily stemming from the limited samples available for each entity-type referent. To address these challenges, we proposed an improved few-shot NER pipeline. First, we introduce a steppingstone span detector that is pre-trained on open-domain Wikipedia data. It can be used to initialize the pipeline span detector to reduce the repetitive training of basic features. Second, we leverage a large language model (LLM) to set reliable entity-type referents, eliminating reliance on few-shot samples of each type. Our model exhibits superior performance with fewer training steps and human-labeled data compared with baselines, as demonstrated through extensive experiments on various datasets. Particularly in fine-grained few-shot NER settings, our model outperforms strong baselines, including ChatGPT. We will publicly release the code, datasets, LLM outputs, and model checkpoints.
CVFeb 13, 2022
Improve Deep Image Inpainting by Emphasizing the Complexity of Missing RegionsYufeng Wang, Dan Li, Cong Xu et al.
Deep image inpainting research mainly focuses on constructing various neural network architectures or imposing novel optimization objectives. However, on the one hand, building a state-of-the-art deep inpainting model is an extremely complex task, and on the other hand, the resulting performance gains are sometimes very limited. We believe that besides the frameworks of inpainting models, lightweight traditional image processing techniques, which are often overlooked, can actually be helpful to these deep models. In this paper, we enhance the deep image inpainting models with the help of classical image complexity metrics. A knowledge-assisted index composed of missingness complexity and forward loss is presented to guide the batch selection in the training procedure. This index helps find samples that are more conducive to optimization in each iteration and ultimately boost the overall inpainting performance. The proposed approach is simple and can be plugged into many deep inpainting models by changing only a few lines of code. We experimentally demonstrate the improvements for several recently developed image inpainting models on various datasets.
LGDec 16, 2021
BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and PreprocessingTianfeng Liu, Yangrui Chen, Dan Li et al.
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data, achieving ground-breaking performance on various tasks such as node classification and graph property prediction. Nonetheless, existing systems are inefficient to train large graphs with billions of nodes and edges with GPUs. The main bottlenecks are the process of preparing data for GPUs - subgraph sampling and feature retrieving. This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas. First, we propose a dynamic cache engine to minimize feature retrieving traffic. By a co-design of caching policy and the order of sampling, we find a sweet spot of low overhead and high cache hit ratio. Second, we improve the graph partition algorithm to reduce cross-partition communication during subgraph sampling. Finally, careful resource isolation reduces contention between different data preprocessing stages. Extensive experiments on various GNN models and large graph datasets show that BGL significantly outperforms existing GNN training systems by 20.68x on average.