20.3LGNov 4, 2024Code
See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly AnalyzersJiaxin Zhuang, Leon Yan, Zhenwei Zhang et al. · tsinghua
Time series anomaly detection (TSAD) is becoming increasingly vital due to the rapid growth of time series data across various sectors. Anomalies in web service data, for example, can signal critical incidents such as system failures or server malfunctions, necessitating timely detection and response. However, most existing TSAD methodologies rely heavily on manual feature engineering or require extensive labeled training data, while also offering limited interpretability. To address these challenges, we introduce a pioneering framework called the Time Series Anomaly Multimodal Analyzer (TAMA), which leverages the power of Large Multimodal Models (LMMs) to enhance both the detection and interpretation of anomalies in time series data. By converting time series into visual formats that LMMs can efficiently process, TAMA leverages few-shot in-context learning capabilities to reduce dependence on extensive labeled datasets. Our methodology is validated through rigorous experimentation on multiple real-world datasets, where TAMA consistently outperforms state-of-the-art methods in TSAD tasks. Additionally, TAMA provides rich, natural language-based semantic analysis, offering deeper insights into the nature of detected anomalies. Furthermore, we contribute one of the first open-source datasets that includes anomaly detection labels, anomaly type labels, and contextual description, facilitating broader exploration and advancement within this critical field. Ultimately, TAMA not only excels in anomaly detection but also provides a comprehensive approach for understanding the underlying causes of anomalies, pushing TSAD forward through innovative methodologies and insights.
1.2NANov 21, 2007
An Improved Procedure for Selecting the Profiles of Perfectly Matched LayersJiawei Zhang
The perfectly matched layers (PMLs), as a boundary termination over an unbounded spatial domain, are widely used in numerical simulations of wave propagation problems. Given a set of discretization parameters, a procedure to select the PML profiles based on minimizing the discrete reflectivity is established for frequency domain simulations. We, by extending the function class and adopting a direct search method, improve the former procedure for traveling waves.
Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy OptimizationXiyue Peng, Hengquan Guo, Jiawei Zhang et al.
Balancing helpfulness and safety (harmlessness) is a critical challenge in aligning large language models (LLMs). Current approaches often decouple these two objectives, training separate preference models for helpfulness and safety, while framing safety as a constraint within a constrained Markov Decision Process (CMDP) framework. This paper identifies a potential issue when using the widely adopted expected safety constraints for LLM safety alignment, termed "safety compensation", where the constraints are satisfied on expectation, but individual prompts may trade off safety, resulting in some responses being overly restrictive while others remain unsafe. To address this issue, we propose Rectified Policy Optimization (RePO), which replaces the expected safety constraint with critical safety constraints imposed on every prompt. At the core of RePO is a policy update mechanism driven by rectified policy gradients, which penalizes the strict safety violation of every prompt, thereby enhancing safety across nearly all prompts. Our experiments demonstrate that RePO outperforms strong baseline methods and significantly enhances LLM safety alignment.
13.1CVSep 2, 2025
TeRA: Rethinking Text-guided Realistic 3D Avatar GenerationYanwen Wang, Yiyu Zhuang, Jiawei Zhang et al.
In this paper, we rethink text-to-avatar generative models by proposing TeRA, a more efficient and effective framework than the previous SDS-based models and general large 3D generative models. Our approach employs a two-stage training strategy for learning a native 3D avatar generative model. Initially, we distill a decoder to derive a structured latent space from a large human reconstruction model. Subsequently, a text-controlled latent diffusion model is trained to generate photorealistic 3D human avatars within this latent space. TeRA enhances the model performance by eliminating slow iterative optimization and enables text-based partial customization through a structured 3D human representation. Experiments have proven our approach's superiority over previous text-to-avatar generative models in subjective and objective evaluation.
7.0OCNov 10, 2020
Distributed Stochastic Consensus Optimization with Momentum for Nonconvex Nonsmooth ProblemsZhiguo Wang, Jiawei Zhang, Tsung-Hui Chang et al.
While many distributed optimization algorithms have been proposed for solving smooth or convex problems over the networks, few of them can handle non-convex and non-smooth problems. Based on a proximal primal-dual approach, this paper presents a new (stochastic) distributed algorithm with Nesterov momentum for accelerated optimization of non-convex and non-smooth problems. Theoretically, we show that the proposed algorithm can achieve an $ε$-stationary solution under a constant step size with $\mathcal{O}(1/ε^2)$ computation complexity and $\mathcal{O}(1/ε)$ communication complexity. When compared to the existing gradient tracking based methods, the proposed algorithm has the same order of computation complexity but lower order of communication complexity. To the best of our knowledge, the presented result is the first stochastic algorithm with the $\mathcal{O}(1/ε)$ communication complexity for non-convex and non-smooth problems. Numerical experiments for a distributed non-convex regression problem and a deep neural network based classification problem are presented to illustrate the effectiveness of the proposed algorithms.