Yahui Li

h-index8

8papers

6citations

Novelty56%

AI Score45

Ranked #70,503 of 201,326 authors (top 35%)#24,849 in CV (top 42%)

8 Papers

SYAug 24, 2018

Online Static Security Assessment of Power Systems Based on Lasso Algorithm

Yahui Li, Yang Li, Yuanyuan Sun

As one important means of ensuring secure operation in a power system, the contingency selection and ranking methods need to be more rapid and accurate. A novel method-based least absolute shrinkage and selection operator (Lasso) algorithm is proposed in this paper to apply to online static security assessment (OSSA). The assessment is based on a security index, which is applied to select and screen contingencies. Firstly, the multi-step adaptive Lasso (MSA-Lasso) regression algorithm is introduced based on the regression algorithm, whose predictive performance has an advantage. Then, an OSSA module is proposed to evaluate and select contingencies in different load conditions. In addition, the Lasso algorithm is employed to predict the security index of each power system operation state with the consideration of bus voltages and power flows, according to Newton-Raphson load flow (NRLF) analysis in post-contingency states. Finally, the numerical results of applying the proposed approach to the IEEE 14-bus, 118-bus, and 300-bus test systems demonstrate the accuracy and rapidity of OSSA.

NIMar 31

TORCH: Characterizing Invalid Route Filtering via Tunnelled Observation

Renrui Tian, Yahui Li, Xia Yin et al.

To mitigate BGP prefix hijacking, the Resource Public Key Infrastructure (RPKI) provides prefix origin authentication via Route Origin Validation (ROV). Despite extensive measurement efforts in IPv4, the protective impact of ROV in IPv6 has yet to be systematically assessed. Existing approaches suffer from limited observability into invalid route propagation: they often rely on a small set of controlled prefixes or cannot fully profile the filtering of in-the-wild RPKI-invalid routes, which undermines the accuracy of assessment. Furthermore, the inherent opacity of the IPv6 data plane exacerbates the difficulty of performing scalable and reliable active measurements. In this paper, we present TORCH, a novel framework for measuring invalid route filtering in IPv6. It repurposes open 6in4 tunnel endpoints as widely distributed vantage points for global measurement. At its core, we develop a cross-plane inference technique that determines reachability without requiring responsive targets. This method allows us to characterize whether and how traffic is steered to invalid origins across diverse routing scenarios, leading to an in-depth evaluation of the real-world impact of ROV. Our measurements reveal that about 27\% of ASes have achieved nearly full ROV protection. However, several permissive Tier-1 ASes still transit traffic towards invalid origins, maintaining a substantial attack surface. Through a prefix-centric analysis, we provide the first empirical evidence that the collateral damage of same-length prefix filtering can affect a significant fraction of the global Internet. Our findings pinpoint fundamental vulnerabilities in ROV deployment and underscore the urgent necessity for network operators to accelerate RPKI adoption. We make our datasets publicly available.

CVApr 25

EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence

Yahui Li, Yinfeng Yu, Liejun Wang et al.

Emotionally talking head video generation aims to generate expressive portrait videos with accurate lip synchronization and emotional facial expressions. Current methods rely on simple emotional labels, leading to insufficient semantic information. While introducing high-level semantics enhances expressiveness, it easily causes lip-sync degradation. Furthermore, mainstream generation methods struggle to balance computational efficiency and global motion awareness in long videos and suffer from poor temporal coherence. Therefore, we propose an \textbf{E}motion-\textbf{A}ware \textbf{D}iffusion model-based \textbf{Net}work, called \textbf{EAD-Net}. We introduce SyncNet supervision and Temporal Representation Alignment (TREPA) to mitigate lip-sync degradation caused by multi-modal fusion. To model complex spatio-temporal dependencies in long video sequences, we propose a Spatio-Temporal Directional Attention (STDA) mechanism that captures global motion patterns through strip attention. Additionally, we design a Temporal Frame graph Reasoning Module (TFRM) to explicitly model temporal coherence between video frames through graph structure learning. To enhance emotional semantic control, a large language model is employed to extract textual descriptions from real videos, serving as high-level semantic guidance. Experiments on the HDTF and MEAD datasets demonstrate that our method outperforms existing methods in terms of lip-sync accuracy, temporal consistency, and emotional accuracy.

CEApr 19, 2024

A Generative Approach to Credit Prediction with Learnable Prompts for Multi-scale Temporal Representation Learning

Yu Lei, Zixuan Wang, Yiqing Feng et al.

Recent industrial credit scoring models remain heavily reliant on manually tuned statistical learning methods. Despite their potential, deep learning architectures have struggled to consistently outperform traditional statistical models in industrial credit scoring, largely due to the complexity of heterogeneous financial data and the challenge of modeling evolving creditworthiness. To bridge this gap, we introduce FinLangNet, a novel framework that reformulates credit scoring as a multi-scale sequential learning problem. FinLangNet processes heterogeneous financial data through a dual-module architecture that combines tabular feature extraction with temporal sequence modeling, generating probability distributions of users' future financial behaviors across multiple time horizons. A key innovation is our dual-prompt mechanism within the sequential module, which introduces learnable prompts operating at both feature-level granularity for capturing fine-grained temporal patterns and user-level granularity for aggregating holistic risk profiles. In extensive evaluations, FinLangNet significantly outperforms a production XGBoost system, achieving a 7.2% improvement in the KS metric and a 9.9% relative reduction in bad debt rate. Its effectiveness as a general-purpose sequential learning framework is further validated through state-of-the-art performance on the public UEA time series classification benchmark. The system has been successfully deployed on DiDi's international finance platform, serving leading financial credit companies in Latin America.

FLU-DYNOct 29, 2025

Conditional neural field for spatial dimension reduction of turbulence data: a comparison study

Junyi Guo, Pan Du, Xiantao Fan et al.

We investigate conditional neural fields (CNFs), mesh-agnostic, coordinate-based decoders conditioned on a low-dimensional latent, for spatial dimensionality reduction of turbulent flows. CNFs are benchmarked against Proper Orthogonal Decomposition and a convolutional autoencoder within a unified encoding-decoding framework and a common evaluation protocol that explicitly separates in-range (interpolative) from out-of-range (strict extrapolative) testing beyond the training horizon, with identical preprocessing, metrics, and fixed splits across all baselines. We examine three conditioning mechanisms: (i) activation-only modulation (often termed FiLM), (ii) low-rank weight and bias modulation (termed FP), and (iii) last-layer inner-product coupling, and introduce a novel domain-decomposed CNF that localizes complexities. Across representative turbulence datasets (WMLES channel inflow, DNS channel inflow, and wall pressure fluctuations over turbulent boundary layers), CNF-FP achieves the lowest training and in-range testing errors, while CNF-FiLM generalizes best for out-of-range scenarios once moderate latent capacity is available. Domain decomposition significantly improves out-of-range accuracy, especially for the more demanding datasets. The study provides a rigorous, physics-aware basis for selecting conditioning, capacity, and domain decomposition when using CNFs for turbulence compression and reconstruction.

CVApr 2, 2025

RealityAvatar: Towards Realistic Loose Clothing Modeling in Animatable 3D Gaussian Avatars

Yahui Li, Zhi Zeng, Liming Pang et al.

Modeling animatable human avatars from monocular or multi-view videos has been widely studied, with recent approaches leveraging neural radiance fields (NeRFs) or 3D Gaussian Splatting (3DGS) achieving impressive results in novel-view and novel-pose synthesis. However, existing methods often struggle to accurately capture the dynamics of loose clothing, as they primarily rely on global pose conditioning or static per-frame representations, leading to oversmoothing and temporal inconsistencies in non-rigid regions. To address this, We propose RealityAvatar, an efficient framework for high-fidelity digital human modeling, specifically targeting loosely dressed avatars. Our method leverages 3D Gaussian Splatting to capture complex clothing deformations and motion dynamics while ensuring geometric consistency. By incorporating a motion trend module and a latentbone encoder, we explicitly model pose-dependent deformations and temporal variations in clothing behavior. Extensive experiments on benchmark datasets demonstrate the effectiveness of our approach in capturing fine-grained clothing deformations and motion-driven shape variations. Our method significantly enhances structural fidelity and perceptual quality in dynamic human reconstruction, particularly in non-rigid regions, while achieving better consistency across temporal frames.

IVSep 18, 2024

Hyperspectral Image Classification Based on Faster Residual Multi-branch Spiking Neural Network

Yang Liu, Yahui Li, Rui Li et al.

Convolutional neural network (CNN) performs well in Hyperspectral Image (HSI) classification tasks, but its high energy consumption and complex network structure make it difficult to directly apply it to edge computing devices. At present, spiking neural networks (SNN) have developed rapidly in HSI classification tasks due to their low energy consumption and event driven characteristics. However, it usually requires a longer time step to achieve optimal accuracy. In response to the above problems, this paper builds a spiking neural network (SNN-SWMR) based on the leaky integrate-and-fire (LIF) neuron model for HSI classification tasks. The network uses the spiking width mixed residual (SWMR) module as the basic unit to perform feature extraction operations. The spiking width mixed residual module is composed of spiking mixed convolution (SMC), which can effectively extract spatial-spectral features. Secondly, this paper designs a simple and efficient arcsine approximate derivative (AAD), which solves the non-differentiable problem of spike firing by fitting the Dirac function. Through AAD, we can directly train supervised spike neural networks. Finally, this paper conducts comparative experiments with multiple advanced HSI classification algorithms based on spiking neural networks on six public hyperspectral data sets. Experimental results show that the AAD function has strong robustness and a good fitting effect. Meanwhile, compared with other algorithms, SNN-SWMR requires a time step reduction of about 84%, training time, and testing time reduction of about 63% and 70% at the same accuracy. This study solves the key problem of SNN based HSI classification algorithms, which has important practical significance for promoting the practical application of HSI classification algorithms in edge devices such as spaceborne and airborne devices.

PLOct 13, 2020

Session-layer Attack Traffic Classification by Program Synthesis

Lei Shi, Yahui Li, Rajeev Alur et al.

Writing classification rules to identify malicious network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both expressiveness to cover a range of attacks and interpretability in fully describing the attack traffic's structure at the session layer. This paper presents Sharingan system, which uses program synthesis techniques to generate network classification programs at the session layer. Sharingan accepts raw network traces as inputs, and reports potential patterns of the attack traffic in NetQRE, a domain specific language designed for specifying session-layer quantitative properties. Using Sharingan, network operators can better analyze the attack pattern due to the following advantages of Sharingan's learning process: (1) it requires minimal feature engineering, (2) it is amenable to efficient implementation of the learnt classifier, and (3) the synthesized program is easy to decipher and edit. We develop a range of novel optimizations that reduce the synthesis time for large and complex tasks to a matter of minutes. Our experiments show that Sharingan is able to correctly identify attacks from a diverse set of network attack traces and generates explainable outputs, while achieving accuracy comparable to state-of-the-art learning-based intrusion detection systems.