LGDec 30, 2022
Label-Efficient Interactive Time-Series Anomaly DetectionHong Guo, Yujing Wang, Jieyu Zhang et al. · microsoft-research, pku
Time-series anomaly detection is an important task and has been widely applied in the industry. Since manual data annotation is expensive and inefficient, most applications adopt unsupervised anomaly detection methods, but the results are usually sub-optimal and unsatisfactory to end customers. Weak supervision is a promising paradigm for obtaining considerable labels in a low-cost way, which enables the customers to label data by writing heuristic rules rather than annotating each instance individually. However, in the time-series domain, it is hard for people to write reasonable labeling functions as the time-series data is numerically continuous and difficult to be understood. In this paper, we propose a Label-Efficient Interactive Time-Series Anomaly Detection (LEIAD) system, which enables a user to improve the results of unsupervised anomaly detection by performing only a small amount of interactions with the system. To achieve this goal, the system integrates weak supervision and active learning collaboratively while generating labeling functions automatically using only a few labeled data. All of these techniques are complementary and can promote each other in a reinforced manner. We conduct experiments on three time-series anomaly detection datasets, demonstrating that the proposed system is superior to existing solutions in both weak supervision and active learning areas. Also, the system has been tested in a real scenario in industry to show its practicality.
AIMay 20, 2022
Measuring algorithmic interpretability: A human-learning-based framework and the corresponding cognitive complexity scoreJohn P. Lalor, Hong Guo
Algorithmic interpretability is necessary to build trust, ensure fairness, and track accountability. However, there is no existing formal measurement method for algorithmic interpretability. In this work, we build upon programming language theory and cognitive load theory to develop a framework for measuring algorithmic interpretability. The proposed measurement framework reflects the process of a human learning an algorithm. We show that the measurement framework and the resulting cognitive complexity score have the following desirable properties - universality, computability, uniqueness, and monotonicity. We illustrate the measurement framework through a toy example, describe the framework and its conceptual underpinnings, and demonstrate the benefits of the framework, in particular for managers considering tradeoffs when selecting algorithms.
COMP-PHSep 12, 2024
Rapid Parameter Estimation for Extreme Mass Ratio Inspirals Using Machine LearningBo Liang, Hong Guo, Tianyu Zhao et al.
Extreme-mass-ratio inspiral (EMRI) signals pose significant challenges in gravitational wave (GW) astronomy owing to their low-frequency nature and highly complex waveforms, which occupy a high-dimensional parameter space with numerous variables. Given their extended inspiral timescales and low signal-to-noise ratios, EMRI signals warrant prolonged observation periods. Parameter estimation becomes particularly challenging due to non-local parameter degeneracies, arising from multiple local maxima, as well as flat regions and ridges inherent in the likelihood function. These factors lead to exceptionally high time complexity for parameter analysis while employing traditional matched filtering and random sampling methods. To address these challenges, the present study applies machine learning to Bayesian posterior estimation of EMRI signals, leveraging the recently developed flow matching technique based on ODE neural networks. Our approach demonstrates computational efficiency several orders of magnitude faster than the traditional Markov Chain Monte Carlo (MCMC) methods, while preserving the unbiasedness of parameter estimation. We show that machine learning technology has the potential to efficiently handle the vast parameter space, involving up to seventeen parameters, associated with EMRI signals. Furthermore, to our knowledge, this is the first instance of applying machine learning, specifically the Continuous Normalizing Flows (CNFs), to EMRI signal analysis. Our findings highlight the promising potential of machine learning in EMRI waveform analysis, offering new perspectives for the advancement of space-based GW detection and GW astronomy.
47.8LGMar 27
FatigueFormer: Static-Temporal Feature Fusion for Robust sEMG-Based Muscle Fatigue RecognitionTong Zhang, Hong Guo, Shuangzhou Yan et al.
We present FatigueFormer, a semi-end-to-end framework that deliberately combines saliency-guided feature separation with deep temporal modeling to learn interpretable and generalizable muscle fatigue dynamics from surface electromyography (sEMG). Unlike prior approaches that struggle to maintain robustness across varying Maximum Voluntary Contraction (MVC) levels due to signal variability and low SNR, FatigueFormer employs parallel Transformer-based sequence encoders to separately capture static and temporal feature dynamics, fusing their complementary representations to improve performance stability across low- and high-MVC conditions. Evaluated on a self-collected dataset spanning 30 participants across four MVC levels (20-80%), it achieves state-of-the-art accuracy and strong generalization under mild-fatigue conditions. Beyond performance, FatigueFormer enables attention-based visualization of fatigue dynamics, revealing how feature groups and time windows contribute differently across varying MVC levels, offering interpretable insight into fatigue progression.
STR-ELSep 15, 2025
Neural-Quantum-States Impurity Solver for Quantum Embedding ProblemsYinzhanghao Zhou, Tsung-Han Lee, Ao Chen et al.
Neural quantum states (NQS) have emerged as a promising approach to solve second-quantised Hamiltonians, because of their scalability and flexibility. In this work, we design and benchmark an NQS impurity solver for the quantum embedding methods, focusing on the ghost Gutzwiller Approximation (gGA) framework. We introduce a graph transformer-based NQS framework able to represent arbitrarily connected impurity orbitals and develop an error control mechanism to stabilise iterative updates throughout the quantum embedding loops. We validate the accuracy of our approach with benchmark gGA calculations of the Anderson Lattice Model, yielding results in excellent agreement with the exact diagonalisation impurity solver. Finally, our analysis of the computational budget reveals the method's principal bottleneck to be the high-accuracy sampling of physical observables required by the embedding loop, rather than the NQS variational optimisation, directly highlighting the critical need for more efficient inference techniques.
LGMar 2, 2024
Identity information based on human magnetocardiography signalsPengju Zhang, Chenxi Sun, Jianwei Zhang et al.
We have developed an individual identification system based on magnetocardiography (MCG) signals captured using optically pumped magnetometers (OPMs). Our system utilizes pattern recognition to analyze the signals obtained at different positions on the body, by scanning the matrices composed of MCG signals with a 2*2 window. In order to make use of the spatial information of MCG signals, we transform the signals from adjacent small areas into four channels of a dataset. We further transform the data into time-frequency matrices using wavelet transforms and employ a convolutional neural network (CNN) for classification. As a result, our system achieves an accuracy rate of 97.04% in identifying individuals. This finding indicates that the MCG signal holds potential for use in individual identification systems, offering a valuable tool for personalized healthcare management.
CPMay 25, 2023
Market Making with Deep Reinforcement Learning from Limit Order BooksHong Guo, Jianwu Lin, Fanlin Huang
Market making (MM) is an important research topic in quantitative finance, the agent needs to continuously optimize ask and bid quotes to provide liquidity and make profits. The limit order book (LOB) contains information on all active limit orders, which is an essential basis for decision-making. The modeling of evolving, high-dimensional and low signal-to-noise ratio LOB data is a critical challenge. Traditional MM strategy relied on strong assumptions such as price process, order arrival process, etc. Previous reinforcement learning (RL) works handcrafted market features, which is insufficient to represent the market. This paper proposes a RL agent for market making with LOB data. We leverage a neural network with convolutional filters and attention mechanism (Attn-LOB) for feature extraction from LOB. We design a new continuous action space and a hybrid reward function for the MM task. Finally, we conduct comprehensive experiments on latency and interpretability, showing that our agent has good applicability.
MES-HALLFeb 10, 2022
AD-NEGF: An End-to-End Differentiable Quantum Transport Simulator for Sensitivity Analysis and Inverse ProblemsYingzhanghao Zhou, Xiang Chen, Peng Zhang et al.
Since proposed in the 70s, the Non-Equilibrium Green Function (NEGF) method has been recognized as a standard approach to quantum transport simulations. Although it achieves superiority in simulation accuracy, the tremendous computational cost makes it unbearable for high-throughput simulation tasks such as sensitivity analysis, inverse design, etc. In this work, we propose AD-NEGF, to our best knowledge the first end-to-end differentiable NEGF model for quantum transport simulations. We implement the entire numerical process in PyTorch, and design customized backward pass with implicit layer techniques, which provides gradient information at an affordable cost while guaranteeing the correctness of the forward simulation. The proposed model is validated with applications in calculating differential physical quantities, empirical parameter fitting, and doping optimization, which demonstrates its capacity to accelerate the material design process by conducting gradient-based parameter optimization.