Yuwen Li

LG
h-index 13
19 papers
112 citations
Novelty 46%
AI Score 53

19 Papers

54.3 NA May 6
Superconvergence in finite element method by smoothing

Yuwen Li, Han Shui, Ludmil Zikatanov

This paper develops a smoothing-based postprocessing method for superconvergence in finite element methods. The method applies a few smoothing iterations, such as damped Jacobi, Gauss-Seidel, or conjugate gradient, with initial guess being the current finite element solution embedded in an enriched finite element space. The resulting procedure is algebraic, easy to implement, and applicable to high-order and three-dimensional discretizations. For symmetric and positive-definite problems, we prove superconvergence of the smoothed solutions under additive and multiplicative smoothers. Effectiveness of the proposed method is demonstrated by numerical experiments for the Poisson, Maxwell, biharmonic and Helmholtz equations.
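As a rough illustration of the smoothing step mentioned in the abstract, the sketch below applies a few damped Jacobi sweeps to a linear system, starting from a given initial guess. It is not the authors' procedure: the embedding into an enriched finite element space is omitted, and the 1D Poisson matrix, the damping factor `omega = 2/3`, the perturbed initial guess, and all function names are assumptions chosen only for the example.

```python
import numpy as np

def damped_jacobi(A, b, x0, omega=2.0 / 3.0, iterations=3):
    """Apply a few damped Jacobi sweeps to A x = b, starting from x0."""
    d = np.diag(A)                      # diagonal of A (assumed nonzero)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iterations):
        x = x + omega * (b - A @ x) / d  # x <- x + omega * D^{-1} * residual
    return x

def poisson_1d(n):
    """Standard 3-point finite difference matrix for -u'' on (0, 1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# Hypothetical test problem: a known solution plus a random perturbation
# plays the role of the initial guess.
n = 31
A = poisson_1d(n)
x_exact = np.sin(np.pi * np.linspace(0.0, 1.0, n + 2)[1:-1])
b = A @ x_exact
rng = np.random.default_rng(0)
x0 = x_exact + 0.1 * rng.standard_normal(n)

x_smooth = damped_jacobi(A, b, x0)
err_before = np.linalg.norm(x0 - x_exact)
err_after = np.linalg.norm(x_smooth - x_exact)
```

For this symmetric positive-definite model matrix the sweeps reduce the error in every eigenmode, with the strongest damping on the oscillatory ones, which is the property a smoother is usually chosen for.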

SP Sep 19, 2022
A Causal Intervention Scheme for Semantic Segmentation of Quasi-periodic Cardiovascular Signals

Xingyao Wang, Yuwen Li, Hongxiang Gao et al.

Precise segmentation is a vital first step in analyzing the semantic information of the cardiac cycle and capturing anomalies in cardiovascular signals. However, in the field of deep semantic segmentation, inference is often unilaterally confounded by the individual attribute of data. For cardiovascular signals, quasi-periodicity is the essential characteristic to be learned, regarded as the synthesis of the attributes of morphology (Am) and rhythm (Ar). Our key insight is to suppress the over-dependence on Am or Ar during the generation of deep representations. To address this issue, we establish a structural causal model as the foundation to customize the intervention approaches on Am and Ar, respectively. In this paper, we propose contrastive causal intervention (CCI) to form a novel training paradigm under a frame-level contrastive framework. The intervention can eliminate the implicit statistical bias brought by the single attribute and lead to more objective representations. We conduct comprehensive experiments under controlled conditions for QRS location and heart sound segmentation. The final results indicate that our approach improves performance by up to 0.41% for QRS location and 2.73% for heart sound segmentation. The efficiency of the proposed method generalizes to multiple databases and noisy signals.

NA Nov 30, 2018
Convergence and Optimality of Adaptive Methods for Poisson's Equation in the FEEC Framework

Michael Holst, Yuwen Li, Adam Mihalik et al.

Finite Element Exterior Calculus (FEEC) was developed by Arnold, Falk, Winther and others over the last decade to exploit the observation that mixed variational problems can be posed on a Hilbert complex, and Galerkin-type mixed methods can then be obtained by solving finite-dimensional subcomplex problems. Chen, Holst, and Xu (Math. Comp. 78 (2009) 35-53) established convergence and optimality of an adaptive mixed finite element method using Raviart-Thomas or Brezzi-Douglas-Marini elements for Poisson's equation on contractible domains in two dimensions, which can be viewed as a boundary problem on the de Rham complex. Recently Demlow and Hirani (Found. Math. Comput. 14 (2014) 1337-1371) developed fundamental tools for a posteriori analysis on the de Rham complex. In this paper, we use tools in FEEC to construct convergence and complexity results on domains with general topology and spatial dimension. In particular, we construct a reliable and efficient error estimator and a sharper quasi-orthogonality result using a novel technique. Without marking for data oscillation, our adaptive method is a contraction with respect to a total error incorporating the error estimator and data oscillation.

NA Apr 17, 2018
Global superconvergence of the lowest order mixed finite element on mildly structured meshes

Yuwen Li

In this paper, we develop global superconvergence estimates for the lowest order Raviart-Thomas mixed finite element method for second order elliptic equations with general boundary conditions on triangular meshes, where most pairs of adjacent triangles form approximate parallelograms. In particular, we prove the $L^{2}$-distance between the numerical solution and canonical interpolant for the vector variable is of order $1+\rho$, where $\rho\in(0,1]$ is dependent on the mesh structure. By a cheap local postprocessing operator $G_{h}$, we prove the $L^{2}$-distance between the exact solution and the postprocessed numerical solution for the vector variable is of order $1+\rho$. As a byproduct, we also obtain the superconvergence estimate for Crouzeix-Raviart nonconforming finite elements on triangular meshes of the same type.

CV Sep 16, 2022
Single Image Deraining via Rain-Steaks Aware Deep Convolutional Neural Network

Chaobing Zheng, Yuwen Li, Shiqian Wu

It is challenging to remove rain streaks from a single rainy image because the rain streaks are spatially varying in the rainy image. This problem is studied in this paper by combining conventional image processing techniques and deep learning based techniques. An improved weighted guided image filter (iWGIF) is proposed to extract high frequency information from a rainy image. The high frequency information mainly includes rain streaks and noise, and it can guide the rain-streaks aware deep convolutional neural network (RSADCNN) to pay more attention to rain streaks. The efficiency and explainability of RSADCNN are improved. Experiments show that the proposed algorithm significantly outperforms state-of-the-art methods on both synthetic and real-world images in terms of both qualitative and quantitative measures. It is useful for autonomous navigation in rainy conditions.

83.8 AI Mar 17
IQuest-Coder-V1 Technical Report

Jian Yang, Wei Zhang, Shawn Guo et al.

In this report, we introduce the IQuest-Coder-V1 series (7B/14B/40B/40B-Loop), a new family of code large language models (LLMs). Moving beyond static code representations, we propose the code-flow multi-stage training paradigm, which captures the dynamic evolution of software logic through different phases of the pipeline. Our models are developed through an evolutionary pipeline, starting with initial pre-training on code facts, repository, and completion data. Following that, we implement a specialized mid-training stage that integrates reasoning and agentic trajectories in 32k-context and repository-scale in 128k-context to forge deep logical foundations. The models are then finalized with post-training of specialized coding capabilities, which is bifurcated into two specialized paths: the thinking path (utilizing reasoning-driven RL) and the instruct path (optimized for general assistance). IQuest-Coder-V1 achieves state-of-the-art performance among competitive models across critical dimensions of code intelligence: agentic software engineering, competitive programming, and complex tool use. To address deployment constraints, the IQuest-Coder-V1-Loop variant introduces a recurrent mechanism designed to optimize the trade-off between model capacity and deployment footprint, offering an architecturally enhanced path for efficacy-efficiency trade-off. We believe the release of the IQuest-Coder-V1 series, including the complete white-box chain of checkpoints from pre-training bases to the final thinking and instruction models, will advance research in autonomous code intelligence and real-world agentic systems.

CR Jun 13, 2025 Code
Investigating Vulnerabilities and Defenses Against Audio-Visual Attacks: A Comprehensive Survey Emphasizing Multimodal Models

Jinming Wen, Xinyi Wu, Shuai Zhao et al.

Multimodal large language models (MLLMs), which bridge the gap between audio-visual and natural language processing, achieve state-of-the-art performance on several audio-visual tasks. Despite the superior performance of MLLMs, the scarcity of high-quality audio-visual training data and computational resources necessitates the utilization of third-party data and open-source MLLMs, a trend that is increasingly observed in contemporary research. This prosperity masks significant security risks. Empirical studies demonstrate that the latest MLLMs can be manipulated to produce malicious or harmful content. This manipulation is facilitated exclusively through instructions or inputs, including adversarial perturbations and malevolent queries, effectively bypassing the internal security mechanisms embedded within the models. To gain a deeper comprehension of the inherent security vulnerabilities associated with audio-visual-based multimodal models, a series of surveys investigates various types of attacks, including adversarial and backdoor attacks. While existing surveys on audio-visual attacks provide a comprehensive overview, they are limited to specific types of attacks and lack a unified review across attack types. To address this issue and gain insights into the latest trends in the field, this paper presents a comprehensive and systematic review of audio-visual attacks, which include adversarial attacks, backdoor attacks, and jailbreak attacks. Furthermore, this paper also reviews various types of attacks in the latest audio-visual-based MLLMs, a dimension notably absent in existing surveys. Drawing upon comprehensive insights from a substantial review, this paper delineates both challenges and emergent trends for future research on audio-visual attacks and defense.

37.6 NA May 6
An Adaptive Finite Element Method Based on Generalized Barycentric Coordinates

Yihui Zhou, Yuwen Li

This work derives an a posteriori error estimate for polygonal finite element methods based on Wachspress barycentric coordinates. In particular, we prove that the classical residual-based a posteriori error estimator provides both upper and lower bounds for the discretization error. The analysis relies on a Scott-Zhang type interpolation and homogeneity arguments for rational functions on polygonal elements. Numerical experiments on square and L-shaped domains demonstrate the effectiveness of the adaptive algorithm.

29.4 LG May 1
PEACE: Cross-modal Enhanced Pediatric-Adult ECG Alignment for Robust Pediatric Diagnosis

Xinran Liu, Yuwen Li, Hongxiang Gao et al.

Automated pediatric electrocardiogram (ECG) diagnosis remains challenging because models trained predominantly on adult data suffer from substantial cross-population mismatch, while pediatric labels are often scarce. We present PEACE (Pediatric-Adult ECG Alignment via Cross-modal Enhancement), a structured cross-modal alignment framework for adult-to-pediatric ECG transfer. PEACE integrates tri-axial clinical semantic decomposition, label-query feature extraction, and curriculum-gated optimization to align transferable adult ECG representations with pediatric diagnostic targets. Since ZZU-pECG provides no paired clinical reports, we generate label-conditioned semantic descriptors using Gemini with concise clinical prompts and use them only as auxiliary training supervision; inference remains ECG-only. On ZZU-pECG, PEACE achieves 59.39%, 79.03%, and 90.89% AUC under zero-shot, 50-shot, and full fine-tuning settings, respectively, and reaches 96.65% AUC on the shared PTB-XL label space. These results suggest that structured clinical semantic supervision can improve low-resource adult-to-pediatric ECG transfer, while prospective clinical validation and more explicit age-aware modeling remain necessary before real-world deployment.

CL Dec 29, 2025
Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing

Yuwen Li, Wei Zhang, Zelong Huang et al.

Enabling Large Language Models (LLMs) to reliably invoke external tools remains a critical bottleneck for autonomous agents. Existing approaches suffer from three fundamental challenges: expensive human annotation for high-quality trajectories, poor generalization to unseen tools, and quality ceilings inherent in single-model synthesis that perpetuate biases and coverage gaps. We introduce InfTool, a fully autonomous framework that breaks these barriers through self-evolving multi-agent synthesis. Given only raw API specifications, InfTool orchestrates three collaborative agents (User Simulator, Tool-Calling Assistant, and MCP Server) to generate diverse, verified trajectories spanning single-turn calls to complex multi-step workflows. The framework establishes a closed loop: synthesized data trains the model via Group Relative Policy Optimization (GRPO) with gated rewards, the improved model generates higher-quality data targeting capability gaps, and this cycle iterates without human intervention. Experiments on the Berkeley Function-Calling Leaderboard (BFCL) demonstrate that InfTool transforms a base 32B model from 19.8% to 70.9% accuracy (+258%), surpassing models 10x larger and rivaling Claude-Opus, entirely from synthetic data without human annotation.

LG Jan 20, 2025
Higher Order Approximation Rates for ReLU CNNs in Korobov Spaces

Yuwen Li, Guozhi Zhang

This paper investigates the $L_p$ approximation error for higher order Korobov functions using deep convolutional neural networks (CNNs) with ReLU activation. For target functions having a mixed derivative of order $m+1$ in each direction, we improve the classical approximation rate of second order to $(m+1)$-th order (modulo a logarithmic factor) in terms of the depth of CNNs. The key ingredient in our analysis is an approximate representation of high-order sparse grid basis functions by CNNs. The results suggest that higher order expressivity of CNNs does not severely suffer from the curse of dimensionality.

LG Jul 14, 2025
Some Super-approximation Rates of ReLU Neural Networks for Korobov Functions

Yuwen Li, Guozhi Zhang

This paper examines the $L_p$ and $W^1_p$ norm approximation errors of ReLU neural networks for Korobov functions. In terms of network width and depth, we derive nearly optimal super-approximation error bounds of order $2m$ in the $L_p$ norm and order $2m-2$ in the $W^1_p$ norm, for target functions with $L_p$ mixed derivative of order $m$ in each direction. The analysis leverages sparse grid finite elements and the bit extraction technique. Our results improve upon classical lowest order $L_\infty$ and $H^1$ norm error bounds and demonstrate that the expressivity of neural networks is largely unaffected by the curse of dimensionality.

SP Jun 25, 2025
Masked Autoencoders that Feel the Heart: Unveiling Simplicity Bias for ECG Analyses

He-Yang Xu, Hongxiang Gao, Yuwen Li et al.

The diagnostic value of electrocardiogram (ECG) lies in its dynamic characteristics, ranging from rhythm fluctuations to subtle waveform deformations that evolve across time and frequency domains. However, supervised ECG models tend to overfit dominant and repetitive patterns, overlooking fine-grained but clinically critical cues, a phenomenon known as Simplicity Bias (SB), where models favor easily learnable signals over subtle but informative ones. In this work, we first empirically demonstrate the presence of SB in ECG analyses and its negative impact on diagnostic performance, while simultaneously discovering that self-supervised learning (SSL) can alleviate it, providing a promising direction for tackling the bias. Following the SSL paradigm, we propose a novel method comprising two key components: 1) Temporal-Frequency aware Filters to capture temporal-frequency features reflecting the dynamic characteristics of ECG signals, and 2) building on this, Multi-Grained Prototype Reconstruction for coarse and fine representation learning across dual domains, further mitigating SB. To advance SSL in ECG analyses, we curate a large-scale multi-site ECG dataset with 1.53 million recordings from over 300 clinical centers. Experiments on three downstream tasks across six ECG datasets demonstrate that our method effectively reduces SB and achieves state-of-the-art performance.

LG May 30, 2023
GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks

Yuwen Li, Miao Xiong, Bryan Hooi

Label errors have been found to be prevalent in popular text, vision, and audio datasets, which heavily influence the safe development and evaluation of machine learning algorithms. Despite increasing efforts towards improving the quality of generic data types, such as images and texts, the problem of mislabel detection in graph data remains underexplored. To bridge the gap, we explore mislabelling issues in popular real-world graph datasets and propose GraphCleaner, a post-hoc method to detect and correct these mislabelled nodes in graph datasets. GraphCleaner combines the novel ideas of 1) Synthetic Mislabel Dataset Generation, which seeks to generate realistic mislabels; and 2) Neighborhood-Aware Mislabel Detection, where neighborhood dependency is exploited in both labels and base classifier predictions. Empirical evaluations on 6 datasets and 6 experimental settings demonstrate that GraphCleaner outperforms the closest baseline, with an average improvement of 0.14 in F1 score, and 0.16 in MCC. On real-data case studies, GraphCleaner detects real and previously unknown mislabels in popular graph benchmarks: PubMed, Cora, CiteSeer and OGB-arxiv; we find that at least 6.91% of PubMed data is mislabelled or ambiguous, and simply removing these mislabelled data can boost evaluation performance from 86.71% to 89.11%.
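The two metrics quoted in the abstract, F1 score and the Matthews correlation coefficient (MCC), can both be computed from binary confusion-matrix counts. The sketch below uses the standard definitions; the counts in the usage line are hypothetical and are not taken from the paper, and the function name is an assumption made for the example.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for a binary detector from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # MCC uses all four cells, so it also rewards correct negatives.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

# Hypothetical counts: 50 truly mislabelled nodes out of 1000, 40 detected,
# with 10 false alarms.
f1, mcc = f1_and_mcc(tp=40, fp=10, fn=10, tn=940)
```

Reporting both is a common choice when the positive class is rare: F1 ignores true negatives, whereas MCC accounts for every cell of the confusion matrix.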

CV Jan 18, 2022
Adaptive Weighted Guided Image Filtering for Depth Enhancement in Shape-From-Focus

Yuwen Li, Zhengguo Li, Chaobing Zheng et al.

Existing shape from focus (SFF) techniques cannot preserve depth edges and fine structural details from a sequence of multi-focus images. Moreover, noise in the sequence of multi-focus images affects the accuracy of the depth map. In this paper, a novel depth enhancement algorithm for the SFF based on an adaptive weighted guided image filtering (AWGIF) is proposed to address the above issues. The AWGIF is applied to decompose an initial depth map which is estimated by the traditional SFF into a base layer and a detail layer. In order to preserve the edges accurately in the refined depth map, the guidance image is constructed from the multi-focus image sequence, and the coefficient of the AWGIF is utilized to suppress the noise while enhancing the fine depth details. Experiments on real and synthetic objects demonstrate the superiority of the proposed algorithm in terms of anti-noise, and the ability to preserve depth edges and fine structural details compared to existing methods.

CV Nov 10, 2021
Single image dehazing via combining the prior knowledge and CNNs

Yuwen Li, Chaobing Zheng, Shiqian Wu et al.

Existing single image haze removal algorithms that are based on prior knowledge and assumptions are subject to many limitations in practical applications and can suffer from noise and halo amplification. An end-to-end system is proposed in this paper to reduce these defects by combining prior knowledge with a deep learning method. The haze image is first decomposed into a base layer and detail layers through a weighted guided image filter (WGIF), and the airlight is estimated from the base layer. Then, the base layer image is passed to an efficient deep convolutional network for estimating the transmission map. To restore objects close to the camera completely without amplifying noise in the sky or in heavily hazy scenes, an adaptive strategy is proposed based on the value of the transmission map. If the transmission map of a pixel is small, the base layer of the haze image is used to recover a haze-free image via the atmospheric scattering model. Otherwise, the haze image is used. Experiments show that the proposed method achieves superior performance over existing methods.
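For reference, the atmospheric scattering model named in the abstract is commonly written as below. The notation is the standard one from the dehazing literature and is not taken from the abstract: $I$ is the observed hazy image, $J$ the haze-free scene, $A$ the airlight, and $t$ the transmission map. The second display is only one possible reading of the adaptive strategy; the base layer symbol $I_b$ and the threshold $\tau$ are assumptions introduced for illustration.

$$
I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)
\quad\Longrightarrow\quad
J(x) = \frac{I(x) - A}{t(x)} + A
$$

$$
J(x) =
\begin{cases}
\dfrac{I_b(x) - A}{t(x)} + A, & t(x) < \tau \\[2ex]
\dfrac{I(x) - A}{t(x)} + A, & t(x) \ge \tau
\end{cases}
$$

Dividing by a small $t(x)$ amplifies whatever noise is present in the numerator, which is why a smoothed base layer is the natural input where the transmission is small.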

LG Sep 21, 2021
Neural networks with trainable matrix activation functions

Zhengqi Liu, Shuhao Cao, Yuwen Li et al.

The training process of neural networks usually optimizes weights and bias parameters of linear transformations, while nonlinear activation functions are pre-specified and fixed. This work develops a systematic approach to constructing matrix-valued activation functions whose entries are generalized from ReLU. The activation is based on matrix-vector multiplications using only scalar multiplications and comparisons. The proposed activation functions depend on parameters that are trained along with the weights and bias vectors. Neural networks based on this approach are simple and efficient and are shown to be robust in numerical experiments.
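One simple way to picture a matrix-valued activation built from comparisons and scalar multiplications is a diagonal matrix $D(x)$ whose entries are chosen by the sign of each input, applied as $D(x)\,x$. The sketch below is an assumption made for illustration and may differ from the construction in the paper; the function name and the slope parameters are hypothetical. In a real network the slopes would be parameters updated by the optimizer, whereas here they are plain NumPy arrays.

```python
import numpy as np

def diagonal_matrix_activation(x, neg_slope, pos_slope):
    """Compute D(x) @ x, where D(x) is diagonal with entry pos_slope[i]
    if x[i] > 0 and neg_slope[i] otherwise."""
    d = np.where(x > 0, pos_slope, neg_slope)  # diagonal of D(x), by comparison
    return d * x                               # equals np.diag(d) @ x

x = np.array([-2.0, 0.5, 3.0])

# Slopes (0, 1) recover the ordinary ReLU.
relu_like = diagonal_matrix_activation(x, np.zeros(3), np.ones(3))

# Other slope values give a different piecewise linear activation per entry.
custom = diagonal_matrix_activation(
    x, np.array([0.1, 0.2, 0.3]), np.array([1.0, 2.0, 0.5])
)
```

Because only comparisons and elementwise products are involved, evaluating such an activation costs about as much as ReLU itself.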

HC Dec 30, 2020
The Challenges of Crowd Workers in Rural and Urban America

Claudia Flores-Saviaga, Yuwen Li, Benjamin V. Hanrahan et al.

Crowd work has the potential of helping the financial recovery of regions traditionally plagued by a lack of economic opportunities, e.g., rural areas. However, we currently have limited information about the challenges facing crowd workers from rural and super rural areas as they struggle to make a living through crowd work sites. This paper examines the challenges and advantages of rural and super rural Amazon Mechanical Turk (MTurk) crowd workers and contrasts them with those of workers from urban areas. Based on a survey of 421 crowd workers from differing geographic regions in the U.S., we identified how across regions, people struggled with being onboarded into crowd work. We uncovered that despite the inequalities and barriers, rural workers tended to be striving more in micro-tasking than their urban counterparts. We also identified cultural traits, relating to time dimension and individualism, that offer us an insight into crowd workers and the necessary qualities for them to succeed on gig platforms. We finish by providing design implications based on our findings to create more inclusive crowd work platforms and tools.

SP May 9, 2020
Temporal-Framing Adaptive Network for Heart Sound Segmentation without Prior Knowledge of State Duration

Xingyao Wang, Chengyu Liu, Yuwen Li et al.

Objective: This paper presents a novel heart sound segmentation algorithm based on Temporal-Framing Adaptive Network (TFAN), including state transition loss and dynamic inference for decoding the most likely state sequence. Methods: In contrast to previous state-of-the-art approaches, the TFAN-based method does not require any knowledge of the state duration of heart sounds and is therefore likely to generalize to non-sinus rhythm. The TFAN-based method was trained on 50 recordings randomly chosen from Training set A of the 2016 PhysioNet/Computing in Cardiology Challenge and tested on the other 12 independent training and test databases (2099 recordings and 52180 beats). The databases for segmentation were separated into three levels of increasing difficulty (LEVEL-I, -II and -III) for performance reporting. Results: The TFAN-based method achieved a superior F1 score for all 12 databases except for 'Test-B', with an average of 96.7%, compared to 94.6% for the state-of-the-art method. Moreover, the TFAN-based method achieved an overall F1 score of 99.2%, 94.4%, 91.4% on LEVEL-I, -II and -III data respectively, compared to 98.4%, 88.54% and 79.80% for the current state-of-the-art method. Conclusion: The TFAN-based method therefore provides a substantial improvement, particularly for more difficult cases, and on data sets not represented in the public training data. Significance: The proposed method is highly flexible and likely to apply to other non-stationary time series. Further work is required to understand to what extent this approach will provide improved diagnostic performance, although it is logical to assume superior segmentation will lead to improved diagnostics.