Luoxiao Yang

h-index: 11

5 papers

29 citations

Novelty: 71%

AI Score: 50

Ranked #20,498 of 194,257 authors (top 11%); #5,038 in LG (top 13%)

5 Papers

14.2 | LG | Jul 10, 2024 | Code

ViTime: Foundation Model for Time Series Forecasting Powered by Vision Intelligence

Luoxiao Yang, Yun Wang, Xinqi Fan et al.

Time series forecasting (TSF) has great practical value in various fields, including power and energy, transportation, etc. TSF methods have been studied based on knowledge ranging from classical statistics to modern deep learning. Yet all of them were developed from one fundamental concept: numerical data fitting. As a result, these models have long been known to be problem-specific and to lack generalizability across applications. Practitioners expect a TSF foundation model that serves TSF tasks in different applications. The central question is then how to develop such a TSF foundation model. This paper offers a pioneering study of TSF foundation model development and proposes, for the first time, a vision intelligence-powered framework, ViTime. ViTime fundamentally shifts TSF from numerical fitting to operations based on a binary image-based time series metric space and naturally supports both point and probabilistic forecasting. We also provide rigorous theoretical analyses of ViTime, including quantization-induced system error bounds and principled strategies for optimal parameter selection. Furthermore, we propose RealTS, an innovative synthesis algorithm that generates diverse and realistic training samples, effectively enriching the training data and significantly enhancing model generalizability. Extensive experiments demonstrate ViTime's state-of-the-art performance. In zero-shot scenarios, ViTime outperforms TimesFM by 9-15%. With just 10% fine-tuning data, ViTime surpasses both leading foundation models and fully-supervised benchmarks, a gap that widens with 100% fine-tuning. ViTime also exhibits exceptional robustness, effectively handling missing data and outperforming TimesFM by 20-30% under various data perturbations, validating the power of its visual space data operation paradigm.
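
The abstract states that an image-based representation supports both point and probabilistic forecasting. A minimal sketch of one way this could work, assuming a predicted image column holds non-negative pixel intensities over value bins: normalize the column into a distribution, then read off the mean and quantiles. The function `column_to_forecast`, its parameters, and the bin layout are hypothetical and are not taken from the ViTime code.

```python
def column_to_forecast(column, lo, hi, quantiles=(0.1, 0.5, 0.9)):
    """Turn one image column into a point forecast and quantiles.

    column: non-negative pixel intensities, index 0 is the lowest value bin.
    lo, hi: numeric range covered by the image height (assumed known).
    """
    height = len(column)
    bin_width = (hi - lo) / height
    # Each pixel row stands for the center of its value bin.
    centers = [lo + (row + 0.5) * bin_width for row in range(height)]

    total = sum(column)
    if total <= 0:
        raise ValueError("column must contain some positive intensity")
    probs = [intensity / total for intensity in column]

    # Point forecast: expectation under the normalized column.
    mean = sum(p * c for p, c in zip(probs, centers))

    # Quantiles: first bin center whose cumulative mass reaches q.
    result = {}
    for q in quantiles:
        cumulative = 0.0
        for p, c in zip(probs, centers):
            cumulative += p
            if cumulative >= q - 1e-12:
                result[q] = c
                break
    return mean, result


mean, bands = column_to_forecast([0, 1, 2, 1, 0], lo=0.0, hi=5.0)
```

The quantile values are limited to bin centers, so their resolution depends on the image height chosen for the representation.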

3.8 | LG | Feb 28, 2023 | Code

Your time series is worth a binary image: machine vision assisted deep framework for time series forecasting

Luoxiao Yang, Xinqi Fan, Zijun Zhang

Time series forecasting (TSF) has been a challenging research area, and various models have been developed to address this task. However, almost all these models are trained with numerical time series data, which is not as effectively processed by the neural system as visual information. To address this challenge, this paper proposes a novel machine vision assisted deep time series analysis (MV-DTSA) framework. The MV-DTSA framework operates by analyzing time series data in a novel binary machine vision time series metric space, which includes a mapping and an inverse mapping function from the numerical time series space to the binary machine vision space, and a deep machine vision model designed to address the TSF task in the binary space. A comprehensive computational analysis demonstrates that the proposed MV-DTSA framework outperforms state-of-the-art deep TSF models, without requiring sophisticated data decomposition or model customization. The code for our framework is accessible at https://github.com/IkeYang/machine-vision-assisted-deep-time-series-analysis-MV-DTSA-.
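
The mapping and inverse mapping between the numerical space and a binary image can be illustrated with a small sketch. It assumes a simple uniform quantization: each time step becomes one image column with a single pixel set in the row of its value bin, and the inverse reads back the bin center. The names `to_binary_image` and `from_binary_image` and the clipping range are hypothetical and may differ from the authors' implementation.

```python
def to_binary_image(series, height, lo, hi):
    """Map a numeric series to a binary image stored as a list of rows.

    Row 0 is the lowest value bin; values outside [lo, hi] are clipped.
    """
    width = len(series)
    image = [[0] * width for _ in range(height)]
    bin_width = (hi - lo) / height
    for t, value in enumerate(series):
        clipped = min(max(value, lo), hi)
        # The top edge (value == hi) falls into the last row.
        row = min(int((clipped - lo) / bin_width), height - 1)
        image[row][t] = 1
    return image


def from_binary_image(image, lo, hi):
    """Inverse mapping: recover the bin center of the set pixel per column."""
    height = len(image)
    width = len(image[0])
    bin_width = (hi - lo) / height
    series = []
    for t in range(width):
        row = next(r for r in range(height) if image[r][t] == 1)
        series.append(lo + (row + 0.5) * bin_width)
    return series


original = [0.0, 0.3, -0.7, 0.95]
image = to_binary_image(original, height=20, lo=-1.0, hi=1.0)
recovered = from_binary_image(image, lo=-1.0, hi=1.0)
```

Under this assumed scheme, the round trip loses at most half a bin width per value for inputs inside the range, so a taller image gives a finer reconstruction at the cost of a larger input.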

1.8 | LG | Mar 24, 2022

Rubik's Cube Operator: A Plug And Play Permutation Module for Better Arranging High Dimensional Industrial Data in Deep Convolutional Processes

Luoxiao Yang, Zhong Zheng, Zijun Zhang

The convolutional neural network (CNN) has been widely applied to process the industrial data based tensor input, which integrates data records of distributed industrial systems from the spatial, temporal, and system dynamics aspects. However, unlike images, information in the industrial data based tensor is not necessarily spatially ordered. Thus, directly applying CNN is ineffective. To tackle this issue, we propose a plug and play module, the Rubik's Cube Operator (RCO), which adaptively permutes the data organization of the industrial data based tensor into an optimal or suboptimal order of attributes before it is processed by CNNs, and which can be updated jointly with the subsequent CNNs via a gradient-based optimizer. The proposed RCO maintains K binary and right stochastic permutation matrices to permute attributes along K axes of the input industrial data based tensor. A novel learning process is proposed to enable learning permutation matrices from data, where the Gumbel-Softmax is employed to reparameterize elements of permutation matrices, and a soft regularization loss is proposed and added to the task-specific loss to ensure the feature diversity of the permuted data. We verify the effectiveness of the proposed RCO by considering two representative learning tasks that process industrial data via CNNs: wind power prediction (WPP) and wind speed prediction (WSP) from the renewable energy domain. Computational experiments are conducted based on four datasets collected from different wind farms, and the results demonstrate that the proposed RCO can improve the performance of CNN based networks significantly.
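
The idea of learning a permutation through the Gumbel-Softmax can be sketched in plain Python. The sketch assumes each row of a logit matrix is relaxed into a near one-hot row, giving a right stochastic matrix, and it uses a column-sum penalty as a stand-in for the regularizer. That penalty is an assumption chosen for illustration, not the paper's soft regularization loss, and all function names are hypothetical.

```python
import math
import random


def gumbel_softmax_rows(logits, temperature, rng):
    """Relax each row of logits into a probability row (sums to 1)."""
    rows = []
    for row in logits:
        noisy = []
        for logit in row:
            # Clamp the uniform sample so both logarithms are defined.
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            gumbel = -math.log(-math.log(u))
            noisy.append((logit + gumbel) / temperature)
        peak = max(noisy)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in noisy]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


def apply_permutation(matrix, attributes):
    """Reorder attributes: out[i] = sum_j matrix[i][j] * attributes[j]."""
    return [sum(m * a for m, a in zip(row, attributes)) for row in matrix]


def column_sum_penalty(matrix):
    """Assumed regularizer: zero when every column also sums to 1."""
    size = len(matrix[0])
    column_sums = [sum(row[j] for row in matrix) for j in range(size)]
    return sum((s - 1.0) ** 2 for s in column_sums)


rng = random.Random(0)
logits = [[0.0, 1.0, 2.0], [2.0, 0.0, 1.0], [1.0, 2.0, 0.0]]
soft_matrix = gumbel_softmax_rows(logits, temperature=0.5, rng=rng)
penalty = column_sum_penalty(soft_matrix)
```

Lower temperatures push each row closer to one-hot. In a real training setup the same computation would be written in an autodiff framework so that the logits receive gradients from the task loss.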

9.1 | AI | Apr 13

Diffusion-CAM: Faithful Visual Explanations for dMLLMs

Haomin Zuo, Yidi Li, Luoxiao Yang et al.

While diffusion Multimodal Large Language Models (dMLLMs) have recently made remarkable strides in multimodal generation, the development of interpretability mechanisms has lagged behind their architectural evolution. Unlike traditional autoregressive models that produce sequential activations, diffusion-based architectures generate tokens via parallel denoising, resulting in smooth, distributed activation patterns across the entire sequence. Consequently, existing Class Activation Mapping (CAM) methods, which are tailored for local, sequential dependencies, are ill-suited for interpreting these non-autoregressive behaviors. To bridge this gap, we propose Diffusion-CAM, the first interpretability method specifically tailored for dMLLMs. We derive raw activation maps by differentiably probing intermediate representations in the transformer backbone, thereby capturing both latent features and their class-specific gradients. To address the inherent stochasticity of these raw signals, we incorporate four key modules to resolve spatial ambiguity and mitigate intra-image confounders and redundant token correlations. Extensive experiments demonstrate that Diffusion-CAM significantly outperforms SoTA methods in both localization accuracy and visual fidelity, establishing a new standard for understanding the parallel generation process of diffusion multimodal systems.
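
Combining latent features with their class-specific gradients is the core of gradient-based CAM methods. The sketch below shows the standard Grad-CAM weighting applied to token features, as a point of reference only: it is not Diffusion-CAM and omits the four refinement modules the abstract mentions. The function name and the assumption that activations and gradients arrive as per-token channel lists are hypothetical.

```python
def grad_cam_token_map(activations, gradients):
    """Grad-CAM style relevance per token.

    activations, gradients: lists of tokens, each a list of channel values,
    taken from one intermediate layer for one target class.
    """
    n_tokens = len(activations)
    n_channels = len(activations[0])

    # Channel weight: gradient averaged over all tokens.
    weights = [
        sum(gradients[t][k] for t in range(n_tokens)) / n_tokens
        for k in range(n_channels)
    ]

    # Weighted sum of channels per token, keeping only positive evidence.
    raw = [
        max(0.0, sum(w * a for w, a in zip(weights, token)))
        for token in activations
    ]

    # Normalize to [0, 1] when there is any positive relevance.
    peak = max(raw)
    return [r / peak for r in raw] if peak > 0 else raw


activations = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
gradients = [[1.0, -1.0], [1.0, -1.0], [1.0, -1.0]]
relevance = grad_cam_token_map(activations, gradients)
```

For image tokens, the resulting list would be reshaped to the patch grid and upsampled to the input resolution to produce a heat map.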

1.4 | CV | Aug 19, 2021 | Code

Generative Wind Power Curve Modeling Via Machine Vision: A Self-learning Deep Convolutional Network Based Method

Luoxiao Yang, Long Wang, Zijun Zhang

This paper develops a novel self-training U-net (STU-net) based method for automated wind power curve (WPC) model generation without requiring data pre-processing. The self-training (ST) process of STU-net has two steps. First, different from traditional studies that regard WPC modeling as a curve fitting problem, in this paper we reformulate WPC modeling from a machine vision perspective. To develop sufficiently diversified training samples, we synthesize supervisory control and data acquisition (SCADA) data based on a set of S-shape functions depicting WPCs. These synthesized SCADA data and WPC functions are visualized as images and paired as training samples (I_x, I_wpc). A U-net is then developed to approximate the model recovering I_wpc from I_x. The developed U-net is applied to observed SCADA data and can successfully generate the I_wpc. Moreover, we develop a pixel mapping and correction process to derive a mathematical form f_wpc representing the I_wpc generated previously. The proposed STU-net needs to be trained only once and does not require any data preprocessing in applications. Numerical experiments based on 76 wind turbines (WTs) are conducted to validate the superiority of the proposed method by benchmarking against classical WPC modeling methods. To demonstrate the repeatability of the presented research, we release our code at https://github.com/IkeYang/STU-net.
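
Synthesizing SCADA data from S-shape functions can be sketched as follows. The sketch assumes a logistic function as the S-shape curve and Gaussian noise on the power readings. The function names, the 0 to 25 m/s wind speed range, and all parameter values are assumptions for illustration; the paper's actual function family and noise model may differ.

```python
import math
import random


def logistic_wpc(wind_speed, rated_power, midpoint, steepness):
    """S-shape wind power curve: power as a function of wind speed."""
    return rated_power / (1.0 + math.exp(-steepness * (wind_speed - midpoint)))


def synthesize_scada(n_samples, rated_power, midpoint, steepness, noise_std, rng):
    """Draw noisy (wind speed, power) pairs scattered around the curve."""
    samples = []
    for _ in range(n_samples):
        wind_speed = rng.uniform(0.0, 25.0)  # assumed operating range in m/s
        clean = logistic_wpc(wind_speed, rated_power, midpoint, steepness)
        noisy = clean + rng.gauss(0.0, noise_std)
        # Keep readings physically plausible: between 0 and rated power.
        power = min(max(noisy, 0.0), rated_power)
        samples.append((wind_speed, power))
    return samples


rng = random.Random(42)
scada = synthesize_scada(
    n_samples=500, rated_power=2000.0, midpoint=9.0,
    steepness=0.8, noise_std=120.0, rng=rng,
)
```

Rendering the scattered samples and the clean curve as two images would give one (I_x, I_wpc) training pair. Varying the curve parameters and noise level across pairs is what diversifies the training set.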