LG · May 27
Stage-wise Distortion-Perception Traversal in Zero-shot Inverse Problems with Diffusion Models
Jiawei Zhang, Ziyuan Liu, Leon Yan et al.
The distortion-perception (D-P) tradeoff is a fundamental phenomenon of Bayesian inverse problems, which characterizes the inherent tension between distortion performance and perceptual quality. Enabling flexible traversal of the D-P tradeoff at inference time is crucial for practical applications. Despite the recent success of diffusion models in zero-shot inverse problem solving, efficient and principled strategies for D-P traversal in diffusion-based inverse algorithms remain inadequately characterized. In this paper, we propose a stage-wise framework for realizing D-P traversal using a single diffusion model in zero-shot inverse problems. Our proposed method, termed MAP-RPS, starts with an MAP estimation stage that approximates the MMSE solution and provides a low-distortion initialization, followed by a re-noised posterior sampling stage that progressively improves perceptual quality. We provide theoretical analyses for both stages, establishing the validity and effectiveness of the proposed design. Furthermore, we extend MAP-RPS to the latent space, yielding LMAP-RPS, which enjoys broader applicability by leveraging large-scale pre-trained latent diffusion backbones. Extensive experiments demonstrate that MAP-RPS and LMAP-RPS enable more effective D-P traversal on various tasks, while also exhibiting strong performance as efficient solvers for real-world inverse problems.
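The tension described in the abstract above can be seen in a closed-form toy problem. The sketch below is not MAP-RPS and uses no diffusion model. It assumes a scalar Gaussian model (x ~ N(0, 1), y = x + sigma * n) and a hypothetical knob `t` that interpolates between the MMSE estimate (`t = 0`) and a posterior sample (`t = 1`). The names `traverse` and `mse_at` are illustrative only.

```python
import math
import random


def traverse(y, sigma, t, z):
    """Interpolate between the MMSE estimate (t=0) and a posterior sample (t=1).

    Toy model (an assumption for illustration): x ~ N(0, 1), y = x + sigma * n.
    """
    post_mean = y / (1.0 + sigma**2)         # MMSE estimate E[x | y]
    post_var = sigma**2 / (1.0 + sigma**2)   # posterior variance Var[x | y]
    return post_mean + t * math.sqrt(post_var) * z


def mse_at(t, sigma=1.0, n=100_000, seed=0):
    """Monte Carlo estimate of the mean squared error at knob value t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)              # ground-truth signal
        y = x + sigma * rng.gauss(0.0, 1.0)  # noisy observation
        z = rng.gauss(0.0, 1.0)              # fresh noise for the sample
        total += (x - traverse(y, sigma, t, z)) ** 2
    return total / n


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        print(t, mse_at(t))
```

In this toy model the expected squared error is `post_var * (1 + t**2)`, so the posterior sample has twice the distortion of the MMSE estimate. Only at `t = 1` does the variance of the estimate match the prior variance, which is the sense in which perceptual quality improves as distortion grows.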
LG · Nov 4, 2024 · Code
See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers
Jiaxin Zhuang, Leon Yan, Zhenwei Zhang et al. · tsinghua
Time series anomaly detection (TSAD) is becoming increasingly vital due to the rapid growth of time series data across various sectors. Anomalies in web service data, for example, can signal critical incidents such as system failures or server malfunctions, necessitating timely detection and response. However, most existing TSAD methodologies rely heavily on manual feature engineering or require extensive labeled training data, while also offering limited interpretability. To address these challenges, we introduce a pioneering framework called the Time Series Anomaly Multimodal Analyzer (TAMA), which leverages the power of Large Multimodal Models (LMMs) to enhance both the detection and interpretation of anomalies in time series data. By converting time series into visual formats that LMMs can efficiently process, TAMA leverages few-shot in-context learning capabilities to reduce dependence on extensive labeled datasets. Our methodology is validated through rigorous experimentation on multiple real-world datasets, where TAMA consistently outperforms state-of-the-art methods in TSAD tasks. Additionally, TAMA provides rich, natural language-based semantic analysis, offering deeper insights into the nature of detected anomalies. Furthermore, we contribute one of the first open-source datasets that includes anomaly detection labels, anomaly type labels, and contextual description, facilitating broader exploration and advancement within this critical field. Ultimately, TAMA not only excels in anomaly detection but also provides a comprehensive approach for understanding the underlying causes of anomalies, pushing TSAD forward through innovative methodologies and insights.
CV · Mar 13, 2025 · Code
Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Learnable Linear Extrapolation
Jiawei Zhang, Ziyuan Liu, Leon Yan et al.
Diffusion-based inverse algorithms have shown remarkable performance across various inverse problems, yet their reliance on numerous denoising steps incurs high computational costs. While recent developments of fast diffusion ODE solvers offer effective acceleration for diffusion sampling without observations, their application in inverse problems remains limited due to the heterogeneous formulations of inverse algorithms and their prevalent use of approximations and heuristics, which often introduce significant errors that undermine the reliability of analytical solvers. In this work, we begin with an analysis of ODE solvers for inverse problems that reveals a linear combination structure of approximations for the inverse trajectory. Building on this insight, we propose a canonical form that unifies a broad class of diffusion-based inverse algorithms and facilitates the design of more generalizable solvers. Inspired by the linear subspace search strategy, we propose Learnable Linear Extrapolation (LLE), a lightweight approach that universally enhances the performance of any diffusion-based inverse algorithm conforming to our canonical form. LLE optimizes the combination coefficients to refine current predictions using previous estimates, alleviating the sensitivity of analytical solvers for inverse algorithms. Extensive experiments demonstrate consistent improvements of the proposed LLE method across multiple algorithms and tasks, indicating its potential for more efficient solutions and boosted performance of diffusion-based inverse algorithms with limited steps. Code for reproducing our experiments is available at https://github.com/weigerzan/LLE_inverse_problem.
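The idea of refining a current prediction with a linear combination of previous estimates can be illustrated with a small least-squares fit. This is only a toy sketch under stated assumptions, not the authors' LLE implementation: it uses two estimates, a known reference vector, and closed-form normal equations, whereas the paper's canonical form and training objective are not reproduced here. The helper names `fit_coefficients` and `extrapolate` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def fit_coefficients(current, previous, reference):
    """Least-squares (a, b) minimizing ||a*current + b*previous - reference||^2.

    Solves the 2x2 normal equations with Cramer's rule.
    """
    cc, pp, cp = dot(current, current), dot(previous, previous), dot(current, previous)
    cr, pr = dot(current, reference), dot(previous, reference)
    det = cc * pp - cp * cp
    if abs(det) < 1e-12:
        raise ValueError("estimates are (nearly) collinear; coefficients are not unique")
    a = (cr * pp - cp * pr) / det
    b = (cc * pr - cp * cr) / det
    return a, b


def extrapolate(current, previous, a, b):
    """Refined prediction as a linear combination of the two estimates."""
    return [a * c + b * p for c, p in zip(current, previous)]


if __name__ == "__main__":
    # Toy trajectory (an assumption): the error halves at every step.
    reference = [1.0, 2.0, 3.0]
    direction = [1.0, -1.0, 0.5]
    previous = [r + 0.5 * d for r, d in zip(reference, direction)]
    current = [r + 0.25 * d for r, d in zip(reference, direction)]
    a, b = fit_coefficients(current, previous, reference)
    print(a, b, extrapolate(current, previous, a, b))
```

For this toy trajectory the fitted coefficients are `a = 2` and `b = -1`, which cancels the halving error exactly. With a real solver the fit would only reduce the error, and the coefficients would have to be learned on data rather than against a known reference.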
SY · Oct 9, 2020
MIMO ILC for Precision SEA robots using Input-weighted Complex-Kernel Regression
Leon Yan, Nathan Banka, Parker Owan et al.
This work improves the positioning precision of lightweight robots with series elastic actuators (SEAs). Lightweight SEA robots, along with low-impedance control, can maneuver without causing damage in uncertain, confined spaces such as inside an aircraft wing during aircraft assembly. Nevertheless, substantial modeling uncertainties in SEA robots reduce the precision achieved by model-based approaches such as inversion-based feedforward. Therefore, this article improves the precision of SEA robots around specified operating points, through a multi-input multi-output (MIMO), iterative learning control (ILC) approach. The main contributions of this article are to (i) introduce an input-weighted complex kernel to estimate local MIMO models using complex Gaussian process regression (c-GPR); (ii) develop Geršgorin-theorem-based conditions on the iteration gains for ensuring ILC convergence to precision within noise-related limits, even with errors in the estimated model; and (iii) demonstrate precision positioning with an experimental SEA robot. Comparative experimental results, with and without ILC, show around 90% improvement in the positioning precision (close to the repeatability limit of the robot) and a 10-times increase in the SEA robot's operating speed with the use of the MIMO ILC.
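The role of the Geršgorin circle theorem in an ILC convergence check can be sketched as follows. This is a generic sufficient condition, not the specific conditions derived in the paper. It assumes a frequency-domain update u_{k+1} = u_k + L e_k at a single frequency, so that the error evolves as e_{k+1} = (I - G L) e_k, where G is the true plant response and L is the learning gain. The matrices and the function names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square complex matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def error_matrix(g, gain):
    """Iteration-to-iteration error map M = I - G L at one frequency."""
    gl = matmul(g, gain)
    n = len(g)
    return [[(1.0 if i == j else 0.0) - gl[i][j] for j in range(n)]
            for i in range(n)]


def gershgorin_contraction(m):
    """True if every Gershgorin disc of m lies strictly inside the unit circle.

    Disc i has center m[i][i] and radius sum_{j != i} |m[i][j]|. Every
    eigenvalue lies in some disc, so this is sufficient (not necessary)
    for the spectral radius of m to be below 1.
    """
    for i, row in enumerate(m):
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        if abs(row[i]) + radius >= 1.0:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical 2x2 plant response at one frequency.
    g = [[1.0 + 0.2j, 0.1 + 0.0j], [0.05 + 0.0j, 0.9 - 0.1j]]
    for rho in (0.5, 2.5):
        # Gain rho * inverse of an estimated model, taken as identity here.
        gain = [[rho, 0.0], [0.0, rho]]
        print(rho, gershgorin_contraction(error_matrix(g, gain)))
```

The check is conservative by design: a gain that fails it may still converge, but a gain that passes it is guaranteed to contract the error at that frequency despite the mismatch between the true response and the estimated model.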