Yong Wang

h-index 49

7 papers

677 citations

Novelty 43%

AI Score 40

Ranked #73,863 of 194,257 authors (top 38%); #25,082 in CV (top 42%)

7 Papers

10.5 | CV | Jul 15, 2024

GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation

Haonan Wang, Jie Liu, Jie Tang et al.

In recent years, 2D human pose estimation has made significant progress on public benchmarks. However, many of these approaches see limited use in industry because of their large parameter counts and computational overhead. Efficient human pose estimation remains a hurdle, especially for whole-body pose estimation with numerous keypoints. While most current methods for efficient human pose estimation rely primarily on CNNs, we propose the Group-based Token Pruning Transformer (GTPT), which fully harnesses the advantages of the Transformer. GTPT alleviates the computational burden by gradually introducing keypoints in a coarse-to-fine manner, minimizing computational overhead while maintaining high performance. In addition, GTPT groups keypoint tokens and prunes visual tokens to improve model performance while reducing redundancy. We propose Multi-Head Group Attention (MHGA) between different groups to achieve global interaction with little computational overhead. We conducted experiments on COCO and COCO-WholeBody. Compared to other methods, GTPT achieves higher performance with less computation, especially in whole-body pose estimation with numerous keypoints.

13.1 | CV | Nov 3, 2025 | Code

MoSa: Motion Generation with Scalable Autoregressive Modeling

Mengyuan Liu, Sheng Yan, Yong Wang et al.

We introduce MoSa, a novel hierarchical motion generation framework for text-driven 3D human motion generation that enhances the Vector Quantization-guided Generative Transformers (VQ-GT) paradigm through a coarse-to-fine scalable generation process. In MoSa, we propose a Multi-scale Token Preservation Strategy (MTPS) integrated into a hierarchical residual vector quantization variational autoencoder (RQ-VAE). MTPS employs interpolation at each hierarchical quantization to effectively retain coarse-to-fine multi-scale tokens. With this, the generative transformer supports Scalable Autoregressive (SAR) modeling, which predicts scale tokens, unlike traditional methods that predict only one token at each step. Consequently, MoSa requires only 10 inference steps, matching the number of RQ-VAE quantization layers. To address potential reconstruction degradation from frequent interpolation, we propose CAQ-VAE, a lightweight yet expressive convolution-attention hybrid VQ-VAE. CAQ-VAE enhances residual block design and incorporates attention mechanisms to better capture global dependencies. Extensive experiments show that MoSa achieves state-of-the-art generation quality and efficiency, outperforming prior methods in both fidelity and speed. On the Motion-X dataset, MoSa achieves an FID of 0.06 (versus MoMask's 0.20) while reducing inference time by 27 percent. Moreover, MoSa generalizes well to downstream tasks such as motion editing, requiring no additional fine-tuning. The code is available at https://mosa-web.github.io/MoSa-web

1.2 | SY | Mar 11, 2016

Internal Model Based Active Disturbance Rejection Control

Jinwen Pan, Yong Wang

The basic active disturbance rejection control (BADRC) algorithm, whose extended state observer (ESO) is only one order higher than the plant, has proven robust to both internal and external disturbances. An advantage of BADRC is that in many applications it can achieve a high level of disturbance attenuation without requiring a detailed model of the plant or disturbance. However, this becomes a disadvantage when the disturbance characteristics are known, since the BADRC algorithm cannot exploit such information. This paper proposes an internal model based ADRC (IADRC) method, which can take advantage of known disturbance characteristics to achieve perfect estimation of the disturbance under some mild assumptions. The effectiveness of the proposed method is validated by comprehensive simulations and comparisons with the BADRC algorithm.
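
As a rough illustration of the basic ADRC idea in this abstract (not the IADRC method the paper proposes), the sketch below simulates a first-order plant with a second-order linear ESO, that is, an observer one order higher than the plant. The plant model, the bandwidth-style observer gains, the parameters `omega` and `kp`, and the function name `simulate_adrc` are all assumptions made for this example.

```python
def simulate_adrc(r=1.0, d=0.5, omega=20.0, kp=5.0, dt=1e-3, steps=5000):
    """Euler simulation of the assumed plant x' = u + d under basic ADRC.

    r: setpoint, d: constant disturbance unknown to the controller,
    omega: observer bandwidth, kp: proportional feedback gain.
    Returns the final plant state and the final disturbance estimate.
    """
    # Observer gains that place both ESO error poles at -omega (assumed tuning).
    beta1, beta2 = 2.0 * omega, omega ** 2
    x = 0.0            # plant state; the measured output is y = x
    z1, z2 = 0.0, 0.0  # ESO estimates of x and of the total disturbance
    for _ in range(steps):
        u = kp * (r - z1) - z2           # feedback plus disturbance cancellation
        e = x - z1                       # observer output error
        z1 += dt * (z2 + u + beta1 * e)  # ESO state estimate update
        z2 += dt * (beta2 * e)           # ESO extended-state (disturbance) update
        x += dt * (u + d)                # plant update; d is never measured
    return x, z2


final_x, d_hat = simulate_adrc()
print(final_x, d_hat)
```

In this toy setup the controller is never told the value of `d`; it only uses the ESO estimate `z2`, which is the model-free property the abstract attributes to BADRC.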

36.9 | CV | Jan 26, 2021 | Code

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation

Pan Zhang, Bo Zhang, Ting Zhang et al.

Self-training is a competitive approach in domain adaptive segmentation, which trains the network with the pseudo labels on the target domain. However, the pseudo labels are inevitably noisy and the target features are dispersed due to the discrepancy between source and target domains. In this paper, we rely on representative prototypes, the feature centroids of classes, to address the two issues for unsupervised domain adaptation. In particular, we take one step further and exploit the feature distances from prototypes, which provide richer information than mere prototypes. Specifically, we use these distances to estimate the likelihood of pseudo labels to facilitate online correction in the course of training. Meanwhile, we align the prototypical assignments based on relative feature distances for two different views of the same target, producing a more compact target feature space. Moreover, we find that distilling the already learned knowledge to a self-supervised pretrained model further boosts the performance. Our method shows a tremendous performance advantage over state-of-the-art methods. We will make the code publicly available.
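
The prototype-based correction described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming Euclidean distances, a softmax over negative distances with a temperature `tau`, and a simple multiplicative reweighting of the network's class probabilities; the function names and the toy data are hypothetical and not taken from the paper's code.

```python
import math


def class_prototypes(features, labels, num_classes):
    """Prototype of a class = mean feature vector (assumes every class appears)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for i, value in enumerate(feat):
            sums[label][i] += value
    return [[s / counts[k] for s in sums[k]] for k in range(num_classes)]


def prototype_weights(feature, prototypes, tau=1.0):
    """Softmax over negative distances: closer prototypes get larger weights."""
    dists = [math.dist(feature, p) for p in prototypes]
    nearest = min(dists)  # subtract the minimum for numerical stability
    exps = [math.exp(-(d - nearest) / tau) for d in dists]
    total = sum(exps)
    return [e / total for e in exps]


def denoise_pseudo_label(probs, feature, prototypes, tau=1.0):
    """Reweight the predicted probabilities by prototype proximity, then argmax."""
    weights = prototype_weights(feature, prototypes, tau)
    scores = [w * p for w, p in zip(weights, probs)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: two well-separated classes in a 2-D feature space.
protos = class_prototypes(
    [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]], [0, 0, 1, 1], num_classes=2
)
# The network slightly prefers class 0, but the feature lies next to prototype 1.
print(denoise_pseudo_label([0.55, 0.45], [3.9, 4.1], protos))
```

In the toy example the raw argmax of the probabilities would pick class 0, while the distance-based weights move the pseudo label to the class whose prototype is nearby.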

12.1 | OC | May 14, 2019

Convolutional neural networks with fractional order gradient method

Dian Sheng, Yiheng Wei, Yuquan Chen et al.

This paper proposes a fractional order gradient method for the backward propagation of convolutional neural networks. To overcome the problem that the fractional order gradient method cannot converge to the real extreme point, a simplified fractional order gradient method is designed based on Caputo's definition. The parameters within layers are updated by the designed gradient method, but the propagation between layers still uses integer order gradients, so the complicated derivatives of composite functions are avoided and the chain rule is preserved. By connecting all layers in series and adding loss functions, the proposed convolutional neural networks can be trained smoothly for various tasks. Finally, practical experiments demonstrate fast convergence, high accuracy, and the ability to escape local optima.
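
To make the idea of a simplified Caputo-style update concrete, here is a minimal scalar sketch. It assumes an update of the form x_{k+1} = x_k - lr * f'(x_k) * |x_k - x_{k-1}|^(1 - alpha) / Gamma(2 - alpha), which uses only the ordinary first derivative and the previous iterate; this form, the small constant `eps`, the starting offset, and the function name are assumptions for illustration and may differ from the paper's exact rule.

```python
import math


def fractional_gradient_descent(grad, x0, alpha=0.9, lr=0.1, steps=500, eps=1e-12):
    """Scalar gradient descent with an assumed simplified fractional-order step.

    grad: function returning the ordinary (integer order) derivative f'(x).
    alpha: fractional order in (0, 1); alpha = 1 recovers plain gradient descent.
    eps: small constant that keeps the step from vanishing when x == x_prev.
    """
    x_prev, x = x0 - 1.0, x0  # hypothetical choice for the initial previous iterate
    scale = lr / math.gamma(2.0 - alpha)
    for _ in range(steps):
        step = scale * grad(x) * (abs(x - x_prev) + eps) ** (1.0 - alpha)
        x_prev, x = x, x - step
    return x


# Minimise f(x) = (x - 3)^2, whose derivative is 2 * (x - 3).
print(fractional_gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0))
```

Because the step is proportional to the ordinary derivative, it vanishes only where f'(x) = 0, which is one way to read the abstract's claim that the simplified method converges to the real extreme point.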

3.1 | AI | Feb 6, 2017

Survey of modern Fault Diagnosis methods in networks

Zi Jian Yang, Yong Wang

With the advent of modern computer networks, fault diagnosis has been a focus of research activity. This paper reviews the history of fault diagnosis in networks and discusses the main methods used in the information gathering, information analyzing, and diagnosing and resolving stages of fault diagnosis in networks. Emphasis is placed on knowledge-based methods, with a discussion of the advantages and shortcomings of the different methods. The survey concludes with a description of some open problems.

1.2 | SY | May 3, 2015

A Unified Stability Analysis Approach for a Class of Interconnected System

Yong Wang

From a structural perspective, this paper investigates a new formulation of the concept of input-to-state stability (ISS) and, based on this formulation, proposes a new stability analysis approach for a class of interconnected systems. The new formulation of ISS better reflects the tendency of the state $x(t)$ to track the input $u(t)$ and reduces the conservativeness of the original form. The stability analysis method, which transforms the interconnected system into an equivalent cascade form, does not depend on a Lyapunov function, overcomes the limitation of the small-gain theorem, and extends the application of ISS. The method is applied to three typical kinds of interconnected systems: it is used to re-prove the small-gain theorem, to analyze the stability of a class of interconnected systems, and to analyze the consensus of a multi-agent system (MAS).
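
For reference, the original form of ISS that this abstract compares against is usually stated as follows; the paper's new formulation is not reproduced here. The system notation $\dot{x} = f(x, u)$ and the symbols $\beta$ and $\gamma$ are the conventional ones and are assumed for this note. A system $\dot{x} = f(x, u)$ is ISS if there exist a class-$\mathcal{KL}$ function $\beta$ and a class-$\mathcal{K}$ function $\gamma$ such that, for every initial state $x(0)$, every bounded input $u$, and all $t \ge 0$,

$$
|x(t)| \le \beta\big(|x(0)|, t\big) + \gamma\Big(\sup_{0 \le s \le t} |u(s)|\Big).
$$

The first term decays with time and captures the effect of the initial state, while the second term bounds the state by the size of the input.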